INVARIANCE OF THE GUTTMAN QUASI-SIMPLEX
LINEAR MODEL UNDER SELECTION 


By BISHWA NATH MUKHERJEE
York University, Toronto


The paper contains a proof that the Guttman additive quasi-simplex model
is invariant under selection of examinees when a set of continuous variables,
each having a different error of measurement, is ordered on the simplex continuum
in terms of either increasing or decreasing magnitudes of 'complexity', provided
that selection is based on those measures which are less variable than each of the
measures of the remaining set. When the variables with the highest 'complexity' ranks
constitute the basis of selection, the invariance breaks down. The
significance of the problem is discussed in the context of personnel selection.
In addition, several implications of the results are stated which hold for
various multivariate statistical techniques.


1. INTRODUCTION 


In many personnel selection programmes it is often impossible to determine
the validities of a set of criterion tests, $y_1, y_2, \ldots, y_q$, for the entire range of
individual differences, since data are available only for those individuals whose
scores on a set of predictor (selection) variables, $x_1, x_2, \ldots, x_p$, are above some
multiple cut-off points or above a particular aggregate score. There is then an
explicit restriction of range in the distribution of scores on the X variables, and
consequently the standard deviations as well as the extent of covariation among
the explicit selection variables, $x_1, x_2, \ldots, x_p$, are affected. Any effects of such
restrictions on the extent of variation and covariation among the non-selection
(criterion) set, $y_1, y_2, \ldots, y_q$, are called incidental. The Y battery in this case is
subject to incidental selection (Gulliksen, 1950, p. 131) in the sense that, while
everyone who took it had received a position, the variability of the group
taking the Y battery was affected by the loss of persons who could not pass the
X battery, since we assume a statistical non-independence of the two sets of
tests. The observed correlations between the two batteries will generally be
modified owing to the select nature of the group who could pass all the tests of the
X battery. Instances of such incidental selection frequently arise from selective
admission situations where predictor scores are available for all the applicants,
but criterion scores are available only for those trainees who could complete
the training programme. In such situations, it often becomes necessary
to correct the correlations (validity coefficients) for indirect (incidental) restriction
of range. The importance of this problem in the area of personnel selection
has been recognized by Thorndike (1949). Techniques for adjusting correlations
for the change in dispersion resulting from such selection were originally developed
by Pearson (1903). The formulae for correcting sample statistics for restriction
of range are given in Gulliksen (1950) for both the bivariate and multivariate cases.
Horst (1966) has also examined the problem at length. Rydberg (1966) has
given extensive consideration to the problem and has discussed a number of
techniques for correcting for selection bias. Rydberg (1962) has also presented
a derivation for correcting correlations for indirect restriction of range when the
non-selection variables do not yield interval data.

In order to determine the extent of influence on the variance-covariance
matrix of the Y battery when Y is subject to incidental selection, various authors,
notably Pearson (1903), Aitken (1934), Burt (1943, 1944) and Gulliksen (1950),
have derived the necessary formulae. Using Aitken's formulae, here reproduced
as (14a) and (14b), one can, for example, determine in what ways the matrix $C$,
the variance-covariance matrix of the criterion and predictor variables after
selection, is different from $\Sigma$, the corresponding covariance matrix for the
population, which is supposed to be a known positive definite symmetric matrix
of order $p+q$.

Consider now that the criterion and predictor variables, each having at
least an interval scale, are $p+q$ normal random variables composed of such an
ordered series of elemental components of ability that the components
constituting a particular variable are involved in the subsequent variables, as in the
case of certain psychological tests developed from the facet design point of view
(Guttman, 1958; Guilford, 1956). If the concept of 'complexity' is defined
in a very general way in terms of the number of components that make up a
particular variable, such that a 'complex' variable will involve more components
than a 'less complex' variable, then it can be said that the variables lying on the
simplex continuum are arranged in a simple order of 'complexity' from least
to most 'complex'. Since each of such variables involves a certain [...]
battery. [...] specifically determine to what extent the
predicted variance-covariance matrix of the Y criterion battery (non-selection
set), hereinafter denoted by $C_{yy}$, can be expressed in the same way as
before the incidental selection. If it is found that the same model that gives [...]
the corresponding population variance-covariance matrix $\Sigma_{yy}$ can also
[...] the predicted variance-covariance matrix, $C_{yy}$, obtained [...],
the model is said to be invariant under selection of [...].
Such an invariance has been demonstrated for the Guttman ([...] 1955) additive
quasi-simplex model. The breakdown of such an invariance
under a particular condition has also been [...].

2. THE GUTTMAN QUASI-SIMPLEX LINEAR MODEL
In the quasi-simplex linear model, the composition of the p-variate observations
($x_1, x_2, \ldots, x_p$) for individual $i$ on occasion $t$ (replication) can be denoted as
follows in the form of a p-component vector:

$$x_{it} = T f_i + e_{it}, \tag{1}$$

where $e_{it}$ is the vector of $p$ mutually orthogonal random response errors which,
in the population of individuals $i$, is supposed to be $N(0, \Gamma)$, $\Gamma$ being the
dispersion matrix of diagonal form with typical elements $\gamma_j$ ($j = 1, 2, \ldots, p$),
and $E(f_i e_{it}') = 0$, while $f_i$ is the vector of $p$ mutually orthogonal hypothetical
variables (factors) having a distribution of the form $N(\mu, \Delta)$. Let $\delta_j$ be the jth
diagonal element of the diagonal matrix $\Delta$. The model matrix $T$ of order $p$ is
a square lower triangular matrix with all its non-zero elements equal to unity.
The inverse of $T$ is a lower auxiliary-unit matrix (Aitken, cited in Bodewig,
1959, p. 24), say $B$, the diagonal elements of which are equal to unity, the first
subdiagonal elements equal to minus one and the remaining elements all
identically equal to zero. Eqn. (1) and the associated assumptions stated above
clearly imply that the observed scores, $x_{it}$, will generate in the population of
individuals the variance-covariance matrix $\Sigma$, of the form:

$$\Sigma = T \Delta T' + \Gamma. \tag{2}$$


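If the population variance-covariance matrix, $\Sigma = \{\sigma_{jk}\}$ ($j, k = 1, 2, \ldots, p$), is
written explicitly for a seven-variate case, it displays a pattern as shown below
when we let

$$\alpha_j = \sum_{k=1}^{j} \delta_k$$

for all $j = 1, 2, \ldots, p$. Clearly, $\Sigma$ of order seven can be explicitly written as
follows:

$$
\Sigma = \begin{bmatrix}
\alpha_1+\gamma_1 & \alpha_1 & \alpha_1 & \alpha_1 & \alpha_1 & \alpha_1 & \alpha_1 \\
\alpha_1 & \alpha_2+\gamma_2 & \alpha_2 & \alpha_2 & \alpha_2 & \alpha_2 & \alpha_2 \\
\alpha_1 & \alpha_2 & \alpha_3+\gamma_3 & \alpha_3 & \alpha_3 & \alpha_3 & \alpha_3 \\
\alpha_1 & \alpha_2 & \alpha_3 & \alpha_4+\gamma_4 & \alpha_4 & \alpha_4 & \alpha_4 \\
\alpha_1 & \alpha_2 & \alpha_3 & \alpha_4 & \alpha_5+\gamma_5 & \alpha_5 & \alpha_5 \\
\alpha_1 & \alpha_2 & \alpha_3 & \alpha_4 & \alpha_5 & \alpha_6+\gamma_6 & \alpha_6 \\
\alpha_1 & \alpha_2 & \alpha_3 & \alpha_4 & \alpha_5 & \alpha_6 & \alpha_7+\gamma_7
\end{bmatrix}. \tag{3}
$$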
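
The pattern in (3) can be checked numerically. The sketch below builds $T \Delta T' + \Gamma$ for a four-variate case and compares it with the pattern "common part $\alpha_{\min(j,k)}$ plus $\gamma_j$ on the diagonal". The parameter values and all function names are illustrative assumptions, not taken from the paper.

```python
def lower_ones(p):
    """Model matrix T: lower triangular with every non-zero element equal to 1."""
    return [[1 if k <= j else 0 for k in range(p)] for j in range(p)]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def diag(values):
    n = len(values)
    return [[values[j] if j == k else 0 for k in range(n)] for j in range(n)]

def quasi_simplex_cov(delta, gamma):
    """Sigma = T Delta T' + Gamma, the form of eqn. (2)."""
    T = lower_ones(len(delta))
    common = matmul(matmul(T, diag(delta)), transpose(T))
    errors = diag(gamma)
    return [[c + e for c, e in zip(rc, re)] for rc, re in zip(common, errors)]

# Assumed (hypothetical) parameter values for a four-variate illustration.
delta = [2, 5, 10, 17]
gamma = [2, 3, 4, 5]
sigma = quasi_simplex_cov(delta, gamma)

# alpha_j = delta_1 + ... + delta_j; entry (j, k) should be alpha_min(j, k),
# with gamma_j added on the diagonal.
alpha = [sum(delta[: j + 1]) for j in range(len(delta))]
pattern = [[alpha[min(j, k)] + (gamma[j] if j == k else 0) for k in range(4)]
           for j in range(4)]
print(sigma == pattern)
```

Integer arithmetic keeps the comparison exact; any positive `delta` and `gamma` should give the same agreement.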


The variance-covariance matrix (3) shows the well-known quasi-simplex
structure as deduced from the Guttman additive quasi-simplex model stated in
(1). The test variables generating such a pattern possess a simple order of
'complexity' (Gabriel, 1954) graded in increasing magnitude in the sense that
the later variables involve all the mutually orthogonal components of the
ability or trait which the previous ones measure and in addition [...]
component which is independent of the other components. Under this
condition, it is obvious that the variance of one variable is included in the variance
of the next variable and this produces a hierarchical order within the [...].
Using a likelihood ratio test which Mukherjee (1966) has derived for the
quasi-simplex covariance structures, it has been verified (Mukherjee, 1963) that a
number of empirical variance-covariance matrices obtained in the area of [...]
curriculum construction, personality ratings and mental testing do have the
pattern as illustrated in (3). Guttman (1954, 1957) has also given
empirical examples of quasi-simplex correlation matrices. Recently Guttman
& Guttman (1963) presented some evidence of cross-cultural invariance of the
intercorrelation pattern found in a set of tests of mental abilities that could be
ordered on the simplex continuum.


Some Properties of the Guttman Quasi-Simplex Correlation Matrix
If the population correlation coefficient for variables j and k ($j < k$) is denoted
by $\rho_{jk}$, the population standard deviation for the jth variable by $\sigma_j$, and
$\gamma_j/\sigma_j$ is denoted by $\epsilon_j$ for all $j = 1, 2, \ldots, p$, it can be seen from (3) that the
relations

$$\rho_{jk} = \frac{\alpha_j}{\sigma_j \sigma_k} = \frac{\sigma_j^2 - \gamma_j}{\sigma_j \sigma_k} = \frac{\sigma_j - \epsilon_j}{\sigma_k} \tag{4a}$$

and similarly

$$\rho_{km} = \frac{\sigma_k - \epsilon_k}{\sigma_m} \tag{4b}$$

will always hold true whenever $j < k < m$. Therefore, for variates $x_j$, $x_k$ and $x_m$,
all of which are scaled on the quasi-simplex continuum having the order
$j < k < m$, we obtain

$$\rho_{jk}\,\rho_{km} = \rho_{jm}\,\frac{\sigma_k - \epsilon_k}{\sigma_k}. \tag{5}$$

The correlation matrix, $\rho = \{\rho_{jk}\}$, corresponding to the covariance
matrix of (3) will show a tendency for the correlations to form a 'diagonal
ridge' (Burt, 1951, p. 43), i.e. the biggest correlations are next to the diagonal [...]
towards the left and right and also going upwards and
downwards.
It is a necessary feature of the covariance matrix $\Sigma$, as well as its corresponding
correlation matrix, $\rho$, that every submatrix above or below the main diagonal
of these matrices is of rank 1, that is:

$$
\begin{vmatrix} \sigma_{mj} & \sigma_{mk} \\ \sigma_{nj} & \sigma_{nk} \end{vmatrix}
= \begin{vmatrix} \rho_{mj} & \rho_{mk} \\ \rho_{nj} & \rho_{nk} \end{vmatrix}
= 0, \tag{6}
$$

when $j < k < m < n$. In general, a determinant of the type shown in (6)
vanishes as long as all four of its elements are on one side of the principal diagonal.
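
A small numerical sketch can illustrate the rank-one property (6) and the product relation among the correlations. It assumes a five-variate covariance matrix with the pattern (3), under which $\rho_{jk}\rho_{km}/\rho_{jm}$ reduces to $1 - \gamma_k/\sigma_k^2$ for $j < k < m$. The parameter values and names are hypothetical.

```python
from itertools import combinations
from math import isclose, sqrt

def quasi_simplex_cov(delta, gamma):
    """Covariance with common part alpha_min(j,k) and gamma_j added on the diagonal."""
    p = len(delta)
    alpha = [sum(delta[: j + 1]) for j in range(p)]
    return [[alpha[min(j, k)] + (gamma[j] if j == k else 0) for k in range(p)]
            for j in range(p)]

# Assumed illustrative parameters (not from the paper).
delta = [2, 5, 10, 17, 26]
gamma = [2, 3, 4, 5, 6]
S = quasi_simplex_cov(delta, gamma)
p = len(S)

# Condition (6): 2x2 minors with rows m, n and columns j, k vanish when j < k < m < n.
minors = [S[m][j] * S[n][k] - S[m][k] * S[n][j]
          for j, k, m, n in combinations(range(p), 4)]
all_minors_zero = all(v == 0 for v in minors)

def rho(j, k):
    """Population correlation between variables j and k."""
    return S[j][k] / sqrt(S[j][j] * S[k][k])

# The ratio rho_jk * rho_km / rho_jm depends only on the middle variable k.
ratio_ok = all(
    isclose(rho(j, k) * rho(k, m) / rho(j, m), 1 - gamma[k] / S[k][k])
    for j, k, m in combinations(range(p), 3)
)
print(all_minors_zero, ratio_ok)
```

`combinations` yields index tuples in increasing order, so each tuple already satisfies the ordering the condition requires.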


Indeterminacy of the Guttman Quasi-Simplex Linear Model
It may now be noted that the covariance matrix deduced from the following
linear model also satisfies the necessary condition (6):

$$x_{it} = H f_i + E e_{it}, \tag{7}$$

where $E$ is a symmetric matrix with units in the diagonal running from northeast
to southwest and zeros elsewhere. The matrix $H$ in (7) and its inverse can be
written explicitly for a seven-variate case as:

$$
H = \begin{bmatrix}
1&1&1&1&1&1&1\\
1&1&1&1&1&1&0\\
1&1&1&1&1&0&0\\
1&1&1&1&0&0&0\\
1&1&1&0&0&0&0\\
1&1&0&0&0&0&0\\
1&0&0&0&0&0&0
\end{bmatrix};
\qquad
H^{-1} = \begin{bmatrix}
0&0&0&0&0&0&1\\
0&0&0&0&0&1&-1\\
0&0&0&0&1&-1&0\\
0&0&0&1&-1&0&0\\
0&0&1&-1&0&0&0\\
0&1&-1&0&0&0&0\\
1&-1&0&0&0&0&0
\end{bmatrix}. \tag{8}
$$

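All other assumptions and definitions involved in (7) are just the same as
for eqn. (1). From the model eqn. (7), which may be called the linear model
of shrinking quasi-simplex covariance structure, it is obvious that the explicit
form of the population variance-covariance matrix, say $\Sigma^*$, for a seven-variate
case will be

$$
\Sigma^* = H \Delta H' + E \Gamma E = \begin{bmatrix}
\alpha_7+\gamma_7 & \alpha_6 & \alpha_5 & \alpha_4 & \alpha_3 & \alpha_2 & \alpha_1 \\
\alpha_6 & \alpha_6+\gamma_6 & \alpha_5 & \alpha_4 & \alpha_3 & \alpha_2 & \alpha_1 \\
\alpha_5 & \alpha_5 & \alpha_5+\gamma_5 & \alpha_4 & \alpha_3 & \alpha_2 & \alpha_1 \\
\alpha_4 & \alpha_4 & \alpha_4 & \alpha_4+\gamma_4 & \alpha_3 & \alpha_2 & \alpha_1 \\
\alpha_3 & \alpha_3 & \alpha_3 & \alpha_3 & \alpha_3+\gamma_3 & \alpha_2 & \alpha_1 \\
\alpha_2 & \alpha_2 & \alpha_2 & \alpha_2 & \alpha_2 & \alpha_2+\gamma_2 & \alpha_1 \\
\alpha_1 & \alpha_1 & \alpha_1 & \alpha_1 & \alpha_1 & \alpha_1 & \alpha_1+\gamma_1
\end{bmatrix}, \tag{9}
$$

where

$$\Delta = \mathrm{Diag}(\delta_1, \delta_2, \ldots, \delta_p), \qquad \Gamma = \mathrm{Diag}(\gamma_1, \gamma_2, \ldots, \gamma_p)$$

and

$$\alpha_j = \sum_{k=1}^{j} \delta_k \qquad (j = 1, 2, \ldots, p),$$

and these symbols are defined in the same way as for (2). An empirical
illustration of the covariance structure shown in (9) has been given by Mukherjee
(1969).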


It can now be shown that

$$\Sigma^* = H \Psi H', \tag{9a}$$

where $\Psi$ is a Jacobi matrix (Gantmacher, 1959) and has the following tridiagonal
form for a four-variate case:

$$
\Psi = H^{-1} \Sigma^* H'^{-1} = B \Sigma B' = \begin{bmatrix}
\delta_1+\gamma_1 & -\gamma_1 & 0 & 0 \\
-\gamma_1 & \delta_2+\gamma_1+\gamma_2 & -\gamma_2 & 0 \\
0 & -\gamma_2 & \delta_3+\gamma_2+\gamma_3 & -\gamma_3 \\
0 & 0 & -\gamma_3 & \delta_4+\gamma_3+\gamma_4
\end{bmatrix}. \tag{9b}
$$

Thus there is a basic indeterminacy in the generating law of the quasi-simplex
pattern (Guttman, 1954, p. 315). Mukherjee (1963) has demonstrated that a
number of other models in addition to (1) and (7) can yield the same [...]
gradient in the correlation matrix which is typical of the quasi-simplex pattern.

[...] as shown in (9b): [...] for all $j$ ($j = 2, 3, \ldots, p$). [...]
possible to express $\Psi$ in terms of its Gaussian factorization (Gantmacher,
1959, vol. 1, p. 38) in the form

$$\Psi = D_1 B D_2 B' D_1, \tag{9c}$$

where

$$D_1 = \mathrm{Diag}[1, \ldots], \tag{9d}$$

$$D_2 = \mathrm{Diag}[\ldots], \tag{9e}$$

in which case it is obvious that

$$\Psi^{-1} = D_1^{-1} T' D_2^{-1} T D_1^{-1}, \tag{9f}$$

the explicit form of which for a four-variate case can be given as

[...]


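Now using the notations employed in (9c), the $\Sigma$ matrix as defined in (3)
can be written as

$$\Sigma = T \Psi T' = T D_1 B D_2 B' D_1 T', \tag{10}$$

which for a four-variate case will have the following form:

$$
\Sigma = \begin{bmatrix}
\psi_{11} & \sigma_{11}-\gamma_1 & \sigma_{11}-\gamma_1 & \sigma_{11}-\gamma_1 \\
\sigma_{11}-\gamma_1 & \psi_{11}+\psi_{22}-2\gamma_1 & \sigma_{22}-\gamma_2 & \sigma_{22}-\gamma_2 \\
\sigma_{11}-\gamma_1 & \sigma_{22}-\gamma_2 & \sigma_{22}+\psi_{33}-2\gamma_2 & \sigma_{33}-\gamma_3 \\
\sigma_{11}-\gamma_1 & \sigma_{22}-\gamma_2 & \sigma_{33}-\gamma_3 & \sigma_{33}+\psi_{44}-2\gamma_3
\end{bmatrix}, \tag{10a}
$$

where $\sigma_{jj}$ and $\psi_{jj}$ are the jth diagonal elements of $\Sigma$ and $\Psi$ respectively. Now
it can be checked that the tridiagonal matrix

$$
\Psi = B \Sigma B' = \begin{bmatrix}
\zeta_1 & -\gamma_1 & 0 & 0 \\
-\gamma_1 & \zeta_2 + (\gamma_1^2/\zeta_1) & -\gamma_2 & 0 \\
0 & -\gamma_2 & \zeta_3 + (\gamma_2^2/\zeta_2) & -\gamma_3 \\
0 & 0 & -\gamma_3 & \zeta_4 + (\gamma_3^2/\zeta_3)
\end{bmatrix} \tag{10b}
$$

can be uniquely factored (Roy & Roy, 1958) as

$$\Psi = B \Sigma B' = \tau \tau', \tag{10c}$$

where $\tau$ is a lower triangular matrix which for a four-variate case can be written
explicitly as

$$
\tau = \begin{bmatrix}
\zeta_1^{1/2} & 0 & 0 & 0 \\
-\gamma_1 \zeta_1^{-1/2} & \zeta_2^{1/2} & 0 & 0 \\
0 & -\gamma_2 \zeta_2^{-1/2} & \zeta_3^{1/2} & 0 \\
0 & 0 & -\gamma_3 \zeta_3^{-1/2} & \zeta_4^{1/2}
\end{bmatrix}. \tag{10d}
$$

From (10c) it readily follows that the determinant of $\Psi$ is the same as the
determinant of $\Sigma$ as well as $\Sigma^*$ and it will be given by

$$|\Psi| = |H^{-1} \Sigma^* H'^{-1}| = |B \Sigma B'| = \prod_{j=1}^{p} \zeta_j = (\zeta_1)(\zeta_2) \cdots (\zeta_p). \tag{11}$$
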
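
The determinant result can be checked numerically. The sketch below computes the pivots of the tridiagonal matrix $\Delta + B \Gamma B'$ by the usual recursion for symmetric tridiagonal matrices, and compares their product with the determinant of the covariance matrix itself. The name `zetas`, the parameter values and the helper functions are assumptions made for illustration.

```python
from fractions import Fraction

def pivots(delta, gamma):
    """Pivots of the tridiagonal matrix Delta + B Gamma B'.

    Diagonal entry j is delta_j + gamma_j + gamma_{j-1}; the off-diagonal
    entry linking j-1 and j is -gamma_{j-1}.
    """
    zetas = []
    for j, d in enumerate(delta):
        psi_jj = d + gamma[j] + (gamma[j - 1] if j > 0 else 0)
        if j == 0:
            zetas.append(Fraction(psi_jj))
        else:
            zetas.append(psi_jj - Fraction(gamma[j - 1] ** 2) / zetas[-1])
    return zetas

def det(A):
    """Determinant by exact Gaussian elimination on Fractions."""
    M = [[Fraction(v) for v in row] for row in A]
    n, value = len(M), Fraction(1)
    for c in range(n):
        piv = next((r for r in range(c, n) if M[r][c] != 0), None)
        if piv is None:
            return Fraction(0)
        if piv != c:
            M[c], M[piv] = M[piv], M[c]
            value = -value
        value *= M[c][c]
        for r in range(c + 1, n):
            factor = M[r][c] / M[c][c]
            M[r] = [v - factor * w for v, w in zip(M[r], M[c])]
    return value

# Assumed three-variate illustration.
delta = [2, 5, 10]
gamma = [2, 3, 4]
alpha = [sum(delta[: j + 1]) for j in range(3)]
sigma = [[alpha[min(j, k)] + (gamma[j] if j == k else 0) for k in range(3)]
         for j in range(3)]

zetas = pivots(delta, gamma)
product = Fraction(1)
for z in zetas:
    product *= z
print(zetas, product, det(sigma))
```

Since the transformation matrix $B$ has unit determinant, the two determinants have to agree; the example simply makes that visible with exact arithmetic.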
The inverse of $\Sigma$ will therefore be given by

$$\Sigma^{-1} = B' D_1^{-1} T' D_2^{-1} T D_1^{-1} B, \tag{11a}$$

where the various diagonal matrices are the inverses of the diagonal matrices
already defined in (9d) and (9e).


3. INVARIANCE RESULTS 


Although the quasi-simplex pattern appearing in the correlation matrix can
be generated by a number of models, the additive quasi-simplex models (1) and
(7) are each invariant under one condition of selection. It will now be proved
that model (1) is invariant under selection of subjects only when the first $j$
variables of an ordered set having the relations $x_1 < x_2 < \cdots < x_p$ constitute the
basis of selection. Similarly, model (7) is invariant under selection when
subjects are selected on the basis of their scores on the last $j$ predictor variables
which are ordered in terms of decreasing 'complexity' on the quasi-simplex
continuum that starts with criterion variables having higher 'complexity' ranks.

In order to show that the after-selection variance-covariance matrix for the
non-selection variables, say $C_{yy}$, can still be expressed as in (2) when the sample
of study has been selected on the basis of their scores obtained in the X battery
under the condition $x_1 < x_2 < \cdots < x_p < y_1 < y_2 < \cdots < y_q$, we first
partition $\Sigma$ of (3) as

$$\Sigma = \begin{bmatrix} \Sigma_{xx} & \Sigma_{xy} \\ \Sigma_{yx} & \Sigma_{yy} \end{bmatrix}, \tag{12}$$

where $\Sigma_{xx}$ is a [...] of individuals [...]

$$x_1 < x_2 < \cdots < x_{p-1} < x_p < y_1 < y_2 < \cdots < y_{q-1} < y_q. \tag{13}$$

It may now be noted that due to selection of subjects on the basis of [...]
battery, there occurs a change not only in the values of $\Sigma_{xx}$, the [...]
selection (predictor) variables, but also in other parts [...],
as has been empirically shown by Thomson (1948).
[...] multivariate selection formula, these changes resulting [...]
battery can be expressed as

$$C_{xy} = C_{xx} \Sigma_{xx}^{-1} \Sigma_{xy}, \tag{14a}$$

$$C_{yy} = \Sigma_{yy} - \Sigma_{yx}(\Sigma_{xx}^{-1} - \Sigma_{xx}^{-1} C_{xx} \Sigma_{xx}^{-1})\Sigma_{xy}, \tag{14b}$$

where $C_{yy} = \{c_{jk}\}$ and $C_{xy}$ denote the changed form of $\Sigma_{yy}$ and $\Sigma_{xy}$,
respectively.
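
The multivariate selection formulae, in the form $C_{xy} = C_{xx}\Sigma_{xx}^{-1}\Sigma_{xy}$ and $C_{yy} = \Sigma_{yy} - \Sigma_{yx}(\Sigma_{xx}^{-1} - \Sigma_{xx}^{-1}C_{xx}\Sigma_{xx}^{-1})\Sigma_{xy}$, can be sketched as a small function. The function name `select`, the helper routines and the bivariate numbers are hypothetical and serve only to show the arithmetic.

```python
from fractions import Fraction

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def sub(A, B):
    return [[a - b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def inverse(A):
    """Exact Gauss-Jordan inverse of a non-singular square matrix."""
    n = len(A)
    M = [[Fraction(v) for v in row] + [Fraction(int(i == j)) for j in range(n)]
         for i, row in enumerate(A)]
    for c in range(n):
        piv = next(r for r in range(c, n) if M[r][c] != 0)
        M[c], M[piv] = M[piv], M[c]
        pivot = M[c][c]
        M[c] = [v / pivot for v in M[c]]
        for r in range(n):
            if r != c:
                factor = M[r][c]
                M[r] = [v - factor * w for v, w in zip(M[r], M[c])]
    return [row[n:] for row in M]

def select(S_xx, S_xy, S_yy, C_xx):
    """Return (C_xy, C_yy) after explicit selection on the x variables."""
    S_inv = inverse(S_xx)
    S_yx = transpose(S_xy)
    C_xy = matmul(matmul(C_xx, S_inv), S_xy)
    middle = sub(S_inv, matmul(matmul(S_inv, C_xx), S_inv))
    C_yy = sub(S_yy, matmul(matmul(S_yx, middle), S_xy))
    return C_xy, C_yy

# Hypothetical bivariate case: var(x) = 4, cov(x, y) = 2, var(y) = 9,
# and selection reduces var(x) to 1.
C_xy, C_yy = select([[4]], [[2]], [[9]], [[1]])
print(C_xy, C_yy)

# With no restriction (C_xx equal to Sigma_xx) nothing changes.
same_xy, same_yy = select([[4]], [[2]], [[9]], [[4]])
print(same_xy, same_yy)
```

Exact fractions are used so that the no-selection case reproduces the population values without rounding error.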
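Now remembering that the inverse of the triangular matrix $T$ is a lower
auxiliary-unit matrix, $B$, we immediately find that $B \Sigma B' = \Delta + B \Gamma B' = \Psi$ as
shown in (9b) and therefore $\Sigma_{yy}$ and $\Sigma_{xx}$ can be expressed as

$$\Sigma_{yy} = T_y(\Delta_y + B_y \Gamma_y B_y')T_y' = T_y \Psi_{yy} T_y', \tag{15a}$$

$$\Sigma_{xx} = T_x(\Delta_x + B_x \Gamma_x B_x')T_x' = T_x \Psi_{xx} T_x', \tag{15b}$$

where $T_y$ is a triangular matrix as defined before of order $q \times q$ and $\Psi_{yy}$ is a
non-singular transformation of $\Sigma_{yy}$ and has a typical Jacobi (tridiagonal) form.
The matrices $T_x$ and $\Psi_{xx}$, respectively, are defined in the same way. Now it
can be verified that $\Sigma_{yx}$ as appearing at the lower left corner of (12) can be
expressed as

$$\Sigma_{yx} = T_y Z_{yx} T_x', \tag{16}$$

where $Z_{yx}$ is a matrix of order $q \times p$ with elements $\delta_1, \delta_2, \ldots, \delta_p$ in the first row,
the elements in the remaining $q-1$ rows being all identically zero. The
transpose of $Z_{yx}$ will hereafter be denoted as $Z_{xy}$.
Using (15a), (15b) and (16), we can write (14b) as

$$
\begin{aligned}
C_{yy} &= T_y \Psi_{yy} T_y' - T_y Z_{yx} T_x'(\Sigma_{xx}^{-1} - \Sigma_{xx}^{-1} C_{xx} \Sigma_{xx}^{-1}) T_x Z_{xy} T_y' \qquad (17) \\
&= T_y \Psi_{yy} T_y' - T_y Z_{yx} T_x' \Sigma_{xx}^{-1}(\Sigma_{xx} - C_{xx})\Sigma_{xx}^{-1} T_x Z_{xy} T_y' \\
&= T_y[\Psi_{yy} - Z_{yx} T_x' B_x' \Psi_{xx}^{-1} B_x(\Sigma_{xx} - C_{xx})B_x' \Psi_{xx}^{-1} B_x T_x Z_{xy}]T_y' \\
&= T_y[\Psi_{yy} - Z_{yx} \Psi_{xx}^{-1} B_x(\Sigma_{xx} - C_{xx})B_x' \Psi_{xx}^{-1} Z_{xy}]T_y' \qquad (18)
\end{aligned}
$$

since $B_x = T_x^{-1}$. Letting $\Psi_{xx}^{-1} B_x(\Sigma_{xx} - C_{xx})B_x' \Psi_{xx}^{-1} = (\Psi_{xx}^{-1} - M_x)$, a
symmetric matrix of order $p$ where $M_x = \{m_{jk}\}$, we can rewrite (18) as follows:

$$C_{yy} = T_y[\Psi_{yy} - Z_{yx}(\Psi_{xx}^{-1} - M_x)Z_{xy}]T_y'. \tag{19}$$

But any matrix which is pre- and post-multiplied by a matrix like $Z_{yx}$, whose
first row has non-zero elements and whose remaining rows are completely null,
will always result in a diagonal matrix, say $D_y$, the first element of which is
non-zero and equal to, say,

$$\sum_{j=1}^{p} \sum_{k=1}^{p} \delta_j \delta_k (\psi^{jk} - m_{jk}),$$

where $\psi^{jk}$ denotes the typical element of $\Psi_{xx}^{-1}$,
and the rest of which is filled with elements which are identically zero. Thus
expression (19) turns out to be

$$
\begin{aligned}
C_{yy} &= T_y(\Psi_{yy} - D_y)T_y' \\
&= T_y(\Delta_y + B_y \Gamma_y B_y' - D_y)T_y' \\
&= T_y(\Delta_y^* + B_y \Gamma_y B_y')T_y' \\
&= T_y \Delta_y^* T_y' + \Gamma_y, \qquad (20)
\end{aligned}
$$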
which is essentially of the form stated in (2). This proves that we are able to 


express the variance-covariance matrix of the non-selection variables in the 
same way as before, in terms of the model matrix T which underlies the Guttman 
additive quasi-simplex model (1). It may be noted in [...] that the
only effect selection has is reflected by the change of [...] $\Delta_y$
matrix, i.e. the variance component corresponding to the [...] ([...],
1950) of the criterion variable, $y_1$, the rank of which on [...]
scale is next to the selection variable, $x_p$. The result implies a [...]
variance and covariances of each of the non-selection (criterion) variables by a
constant amount of difference. If we now define $\omega$ as a known scalar
equal to the first non-zero element of the $D_y$ matrix and $1'$ as a [...] of
order $1 \times q$ having all its elements equal to unity, then our results [...]
prove that even under selection of subjects on the basis of [...]
first $p$ simplex variables ordered from simple to complex, eqn. (14b) will
reduce to:

$$C_{yy} = \Sigma_{yy} - \omega 1 1'. \tag{21}$$

Using (20) and (9f), one can obtain the inverse of $C_{yy}$ which may be
expressed as

$$
\begin{aligned}
C_{yy}^{-1} &= [T_y(\Psi_{yy} - D_y)T_y']^{-1} \\
&= B_y'(\Psi_{yy} - D_y)^{-1} B_y \\
&= B_y'(D_1^{*-1} T_y' D_2^{*-1} T_y D_1^{*-1})B_y, \qquad (21a)
\end{aligned}
$$

where

$$D_1^* = \mathrm{Diag}[1, \ldots]$$

[...]


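Therefore the inverse of the correlation matrix, $R_{yy}$, corresponding to $C_{yy}$ can
be expressed as

$$R_{yy}^{-1} = D_\sigma C_{yy}^{-1} D_\sigma, \tag{22}$$

where $D_\sigma$ is a diagonal matrix [...].

From (22) it can be further [...] selection [...]
in the intercorrelations among the non-selection variables, Y. [...]

$$\rho_{jk}^2 = \frac{\alpha_j^2}{(\alpha_j + \gamma_j)(\alpha_j + \delta_{j+1} + \delta_{j+2} + \cdots + \delta_k + \gamma_k)}. \tag{23}$$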
If we now let

$$\sum_{a=j+1}^{k} \delta_a + \gamma_k = \xi_k$$

and then divide both the numerator and denominator by $\alpha_j^2$, the right-hand
side of eqn. (23) turns out to be

$$\rho_{jk}^2 = \frac{1}{(1 + \nu_j)(1 + \xi_k/\alpha_j)}, \tag{23a}$$

where $\nu_j$ is equal to $\gamma_j/\alpha_j$. When eqn. (21) is explicitly written for $R_{yy}$, we
immediately find that the square of the predicted correlation between variables
j and k after selection can be expressed as

$$r_{jk}^2 = \frac{(\alpha_j - \omega)^2}{(\alpha_j - \omega + \gamma_j)(\alpha_j - \omega + \xi_k)}. \tag{23b}$$

Dividing both the numerator and denominator of eqn. (23b) by $(\alpha_j - \omega)^2$, we get

$$r_{jk}^2 = \frac{1}{\left(1 + \dfrac{\gamma_j}{\alpha_j - \omega}\right)\left(1 + \dfrac{\xi_k}{\alpha_j - \omega}\right)}. \tag{23c}$$

Since $\omega > 0$, it is obvious from (23a) and (23c) that $\rho_{jk} > r_{jk}$. Therefore it
is proved that explicit selection on the basis of the first $p$ variables of the X battery
leads to a general reduction in the magnitude of correlation between any two
variables of the non-selection set, Y. Using the same procedure as followed in
eqns. (23) to (23c), it can be similarly demonstrated that explicit selection on the
basis of the first $p$ variables having the lowest 'simplex' ranks only results in an
attenuation of the correlation between a non-selection variable and a selection
variable.
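
The attenuation can be illustrated with a few lines of arithmetic. The sketch assumes that selection lowers every variance and covariance of the non-selection set by the same positive constant, and compares the squared correlation before and after. The numerical values and the function name are hypothetical.

```python
from fractions import Fraction

def squared_corr(alpha_j, gamma_j, xi_k, omega=0):
    """Squared correlation between non-selection variables j < k.

    alpha_j is their covariance, alpha_j + gamma_j the variance of j and
    alpha_j + xi_k the variance of k; omega is the constant removed from
    all three by selection (omega = 0 gives the population value).
    """
    a = Fraction(alpha_j) - omega
    return a * a / ((a + gamma_j) * (a + xi_k))

# Assumed values: covariance 34, variances 39 and 66, reduction 143/192.
before = squared_corr(34, 5, 32)
after = squared_corr(34, 5, 32, omega=Fraction(143, 192))
print(before, float(before))
print(after, float(after))
print(after < before)
```

The comparison holds for any reduction smaller than the covariance, because each factor $a/(a + \text{const})$ increases with $a$.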
4. A HYPOTHETICAL EXAMPLE 
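For the purpose of illustration, a hypothetical example is given below
in which the variables are ordered as in the quasi-simplex continuum
satisfying the relation expressed in (13). Assume that the parameters of the
variance-covariance matrix $\Sigma$ as expressed in (10a), with $\Psi$ given by (10b),
have the following values:

$$\zeta_1 = 4, \quad \zeta_2 = 9, \quad \zeta_3 = 16, \quad \zeta_4 = 25, \quad \zeta_5 = 36, \quad \zeta_6 = 49,$$

and

$$\gamma_1 = 2, \quad \gamma_2 = 3, \quad \gamma_3 = 4, \quad \gamma_4 = 5, \quad \gamma_5 = 6, \quad \gamma_6 = 7.$$

We can immediately construct the matrix $\Sigma$ and write it below in the partitioned
form as shown in (12):

$$
\Sigma = \left[\begin{array}{ccc|ccc}
4 & 2 & 2 & 2 & 2 & 2 \\
2 & 10 & 7 & 7 & 7 & 7 \\
2 & 7 & 21 & 17 & 17 & 17 \\
\hline
2 & 7 & 17 & 39 & 34 & 34 \\
2 & 7 & 17 & 34 & 66 & 60 \\
2 & 7 & 17 & 34 & 60 & 104
\end{array}\right].
$$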
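It can be checked that the submatrix $\Sigma_{xx}$ has the following inverse:

$$
\Sigma_{xx}^{-1} = \begin{bmatrix}
\frac{161}{576} & -\frac{7}{144} & -\frac{1}{96} \\
-\frac{7}{144} & \frac{5}{36} & -\frac{1}{24} \\
-\frac{1}{96} & -\frac{1}{24} & \frac{1}{16}
\end{bmatrix}.
$$

Assuming that the matrix of sample covariances among the explicit selection
variables, $x_1$, $x_2$ and $x_3$, is of the following form:

$$
C_{xx} = \begin{bmatrix} 3 & 0 & -1 \\ 0 & 9 & 5 \\ -1 & 5 & 21 \end{bmatrix},
$$

it immediately follows that

$$
\Sigma_{xx} - C_{xx} = \begin{bmatrix} 1 & 2 & 3 \\ 2 & 1 & 2 \\ 3 & 2 & 0 \end{bmatrix}.
$$

For the given hypothetical data, the predicted variance-covariance matrix of
the Y set of non-selection variables will be given by

$$C_{yy} = \Sigma_{yy} - \Sigma_{yx} \Sigma_{xx}^{-1}(\Sigma_{xx} - C_{xx})\Sigma_{xx}^{-1} \Sigma_{xy}.$$

Since $\Sigma_{yx} \Sigma_{xx}^{-1}$ turns out to be

$$
\Sigma_{yx} \Sigma_{xx}^{-1} = \frac{1}{576} \begin{bmatrix} 24 & 96 & 432 \\ 24 & 96 & 432 \\ 24 & 96 & 432 \end{bmatrix},
$$

it can be checked that

$$
C_{yy} = \begin{bmatrix} 39 & 34 & 34 \\ 34 & 66 & 60 \\ 34 & 60 & 104 \end{bmatrix}
- \frac{143}{192} \begin{bmatrix} 1 & 1 & 1 \\ 1 & 1 & 1 \\ 1 & 1 & 1 \end{bmatrix},
$$

which, writing $\omega = 143/192$, can be expressed as

$$
\begin{aligned}
C_{yy} &= \begin{bmatrix} 1 & 0 & 0 \\ 1 & 1 & 0 \\ 1 & 1 & 1 \end{bmatrix}
\begin{bmatrix} 39 - \omega & -5 & 0 \\ -5 & 37 & -6 \\ 0 & -6 & 50 \end{bmatrix}
\begin{bmatrix} 1 & 1 & 1 \\ 0 & 1 & 1 \\ 0 & 0 & 1 \end{bmatrix} \\
&= \begin{bmatrix} 1 & 0 & 0 \\ 1 & 1 & 0 \\ 1 & 1 & 1 \end{bmatrix}
\begin{bmatrix} 34 - \omega & 0 & 0 \\ 0 & 26 & 0 \\ 0 & 0 & 37 \end{bmatrix}
\begin{bmatrix} 1 & 1 & 1 \\ 0 & 1 & 1 \\ 0 & 0 & 1 \end{bmatrix}
+ \begin{bmatrix} 5 & 0 & 0 \\ 0 & 6 & 0 \\ 0 & 0 & 7 \end{bmatrix}.
\end{aligned}
$$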
The above expression of the predicted variance-covariance matrix, $C_{yy}$, is
similar to that given in (20). The hypothetical example clearly demonstrates
the invariance of the Guttman quasi-simplex linear model under selection of
subjects when the X battery is subjected to explicit selection and the variables
in the X battery in relation to the variables in the Y battery (non-selection variables)
satisfy eqn. (13).
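
The arithmetic of the example can be recomputed exactly. The sketch below assumes a six-variate population matrix whose common parts are 2, 7, 17, 34, 60, 97 and whose error variances are 2 to 7, with the first three variables forming the selection set and an after-selection matrix $C_{xx}$ with rows (3, 0, -1), (0, 9, 5), (-1, 5, 21). These inputs and the helper names are assumptions reconstructed for illustration. The sketch then checks whether the change in the Y block is a constant subtracted from every element.

```python
from fractions import Fraction

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def sub(A, B):
    return [[a - b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def inverse(A):
    """Exact Gauss-Jordan inverse of a non-singular square matrix."""
    n = len(A)
    M = [[Fraction(v) for v in row] + [Fraction(int(i == j)) for j in range(n)]
         for i, row in enumerate(A)]
    for c in range(n):
        piv = next(r for r in range(c, n) if M[r][c] != 0)
        M[c], M[piv] = M[piv], M[c]
        pivot = M[c][c]
        M[c] = [v / pivot for v in M[c]]
        for r in range(n):
            if r != c:
                factor = M[r][c]
                M[r] = [v - factor * w for v, w in zip(M[r], M[c])]
    return [row[n:] for row in M]

# Assumed population structure (common parts alpha, error variances gamma).
alpha = [2, 7, 17, 34, 60, 97]
gamma = [2, 3, 4, 5, 6, 7]
sigma = [[alpha[min(j, k)] + (gamma[j] if j == k else 0) for k in range(6)]
         for j in range(6)]
S_xx = [row[:3] for row in sigma[:3]]
S_xy = [row[3:] for row in sigma[:3]]
S_yy = [row[3:] for row in sigma[3:]]
C_xx = [[3, 0, -1], [0, 9, 5], [-1, 5, 21]]   # assumed after-selection covariances

W = matmul(transpose(S_xy), inverse(S_xx))     # Sigma_yx Sigma_xx^{-1}
C_yy = sub(S_yy, matmul(matmul(W, sub(S_xx, C_xx)), transpose(W)))

omega = S_yy[0][0] - C_yy[0][0]
constant_shift = all(S_yy[j][k] - C_yy[j][k] == omega
                     for j in range(3) for k in range(3))
print(W[0], omega, constant_shift)
```

A constant shift of this kind leaves the differences between the common parts, and the error variances, untouched, which is why the additive form survives.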


5. BREAKDOWN OF INVARIANCE


When instead of the first $p$ simplex variables, selection of subjects is done
on the basis of the last $q$ variables graduated in terms of increasing amount of
'complexity' and the first $p$ variables of the X battery constitute the non-selection
set, then it is not possible to express $C_{xx}$, the changed variance-covariance
matrix for the non-selection variables, as (2). In order to show this, let x and y
be replaced by y and x, respectively, in eqns. (14), (15a) and (16). Under this
changed condition, we find that the equation corresponding to (19) becomes

$$C_{xx} = T_x[\Psi_{xx} - Z_{xy}(\Psi_{yy}^{-1} - M_y)Z_{yx}]T_x', \tag{24}$$

where all the symbols are already defined. Now it can be verified with reference
to (3) and (16) that

$$Z_{xy}(\Psi_{yy}^{-1} - M_y)Z_{yx} = m\,\delta \delta', \tag{24a}$$

where $m$ is the first main diagonal element of $(\Psi_{yy}^{-1} - M_y)$ and $\delta$ is a $p \times 1$ vector with
elements $\delta_1, \delta_2, \ldots, \delta_p$ which are not necessarily zero. Therefore the matrix
shown at the left-hand side of (24a), i.e. $Z_{xy}(\Psi_{yy}^{-1} - M_y)Z_{yx}$, will not have the
required diagonal form. As such, the possibility of expressing $C_{xx}$ in a form
similar to (2) does not exist when selection is based on the Y battery and the x and y
variables satisfy condition (13). Consequently, the variance-covariance matrix
$C_{xx}$ will not have the Guttman quasi-simplex pattern under the above condition,
since it fails to satisfy both (5) and (6) as a result of this type of selection. This
will be true even when the sample variance-covariance matrix of the selection
variables, i.e. $C_{yy}$, is a null matrix resulting from a complete homogeneity in the
sample taking the tests of the Y battery.
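
The contrast between the two directions of selection can be sketched numerically. Using an assumed six-variate quasi-simplex matrix, the code applies the selection formula once with the three simplest variables as the selection set and once with the three most complex ones, halving the covariance matrix of the selection set in each case (an arbitrary illustrative choice). It then transforms the non-selection block by $B$, the inverse of $T$: the additive form (2) requires the result to be tridiagonal, so a non-zero corner element signals the breakdown. All names and values are hypothetical.

```python
from fractions import Fraction

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def sub(A, B):
    return [[a - b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def inverse(A):
    """Exact Gauss-Jordan inverse of a non-singular square matrix."""
    n = len(A)
    M = [[Fraction(v) for v in row] + [Fraction(int(i == j)) for j in range(n)]
         for i, row in enumerate(A)]
    for c in range(n):
        piv = next(r for r in range(c, n) if M[r][c] != 0)
        M[c], M[piv] = M[piv], M[c]
        pivot = M[c][c]
        M[c] = [v / pivot for v in M[c]]
        for r in range(n):
            if r != c:
                factor = M[r][c]
                M[r] = [v - factor * w for v, w in zip(M[r], M[c])]
    return [row[n:] for row in M]

def after_selection(S_nn, S_ns, S_ss, C_ss):
    """Covariance of the non-selection set (n) after selection on set s."""
    W = matmul(S_ns, inverse(S_ss))
    return sub(S_nn, matmul(matmul(W, sub(S_ss, C_ss)), transpose(W)))

def halve(M):
    return [[Fraction(v, 2) for v in row] for row in M]

alpha = [2, 7, 17, 34, 60, 97]   # assumed common parts
gamma = [2, 3, 4, 5, 6, 7]       # assumed error variances
sigma = [[alpha[min(j, k)] + (gamma[j] if j == k else 0) for k in range(6)]
         for j in range(6)]
S_xx = [row[:3] for row in sigma[:3]]
S_xy = [row[3:] for row in sigma[:3]]
S_yy = [row[3:] for row in sigma[3:]]
B = [[1, 0, 0], [-1, 1, 0], [0, -1, 1]]   # inverse of the 3 x 3 matrix T

C_yy = after_selection(S_yy, transpose(S_xy), S_xx, halve(S_xx))  # select on x
C_xx = after_selection(S_xx, S_xy, S_yy, halve(S_yy))             # select on y

corner_y = matmul(matmul(B, C_yy), transpose(B))[0][2]
corner_x = matmul(matmul(B, C_xx), transpose(B))[0][2]
print(corner_y, corner_x)
```

A zero corner for the Y block and a non-zero corner for the X block match the argument built on (19) and (24a).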

We now consider a situation where the predictor and criterion variables
obey (7) and satisfy the following relation:

$$x_1 > x_2 > \cdots > x_p > y_1 > y_2 > \cdots > y_{q-1} > y_q. \tag{25}$$

The variance-covariance matrix of the variables shown in (25) will have the
pattern as in (9) when they obey the model of shrinking quasi-simplex covariance
structure given in (7). Now suppose that all the $p$ tests of the X battery constitute
the basis of selection. Under this condition it may be shown that the
variance-covariance matrix for the non-selection variables after selection, say $C_{yy}^*$,
cannot be expressed in the same way as (9). In order to show this, we first
partition $\Sigma^*$ of (9) in the same way as (12) and then prove that the submatrix
$\Sigma_{yx}^*$ of (9) can be expressed as

$$\Sigma_{yx}^* = H_y \Delta_y G_{yx}, \tag{26}$$

where $H_y$ is a symmetric non-singular matrix of order $q$ [...]
and is of the form shown in (8). In (26), $\Delta_y$ is a diagonal matrix [...] as
defined before, and $G_{yx}$ is a singular matrix of order $q \times p$ having [...]
equal to unity. Following (9a), (14b) and (26) we now [...] variance-
covariance matrix of the non-selection variables $y_1, y_2, \ldots, y_q$ after [...]

$$
\begin{aligned}
C_{yy}^* &= H_y \Psi_{yy} H_y - H_y \Delta_y G_{yx} \Sigma_{xx}^{*-1}(\Sigma_{xx}^* - C_{xx}^*)\Sigma_{xx}^{*-1} G_{xy} \Delta_y H_y \\
&= H_y[\Psi_{yy} - \Delta_y G_{yx} K_x \Psi_{xx}^{-1} K_x'(\Sigma_{xx}^* - C_{xx}^*)K_x \Psi_{xx}^{-1} K_x' G_{xy} \Delta_y]H_y \qquad (27) \\
&= H_y(\Psi_{yy} - \Delta_y g_{yx} U_x g_{xy} \Delta_y)H_y,
\end{aligned}
$$

where $K_x$ denotes the inverse of $H_x$ and $G_{yx} K_x = g_{yx}$ is a matrix of order $q \times p$
all of whose elements in the first
column are equal to unity and the rest of the matrix is null. We also [...] (27)
the identity $\Psi_{xx}^{-1} K_x'(\Sigma_{xx}^* - C_{xx}^*)K_x \Psi_{xx}^{-1} = U_x$ with elements $\{u_{jk}\}$.
Algebraic manipulation of the last term in the parenthesis reveals that (27) can be
further reduced to

$$C_{yy}^* = H_y(\Psi_{yy} - u_{11} \Delta_y G_{yy} \Delta_y)H_y, \tag{28}$$

where $u_{11}$ is the first element of the matrix $U_x$ and $G_{yy}$ is a [...] of
order $q$ having all its elements equal to unity. The last term of [...] the
parenthesis is definitely not a diagonal matrix, as can be readily checked. As
such, it is not possible to express $C_{yy}^*$ as (9) when a set of [...] follow
relation (25) and the generating principle (7), provided that the first $p$ variables
having larger variances than any of the remaining variables constitute the basis
of selection.

When instead of the first $p$ variables of (25), selection is based on the last $q$
variables which form the Y battery [...] the tests of the X set, there
occurs an invariance of the equation (9) even [...]
selection. To prove this, the variance-covariance matrix $\Sigma^*$ of (9) is partitioned
in the same way as (12), in which case the submatrix $\Sigma_{xy}^*$ of (9) will have the
following explicit form when $p = 4$ and $q = 3$:

$$
\Sigma_{xy}^* = \begin{bmatrix}
\alpha_3 & \alpha_2 & \alpha_1 \\
\alpha_3 & \alpha_2 & \alpha_1 \\
\alpha_3 & \alpha_2 & \alpha_1 \\
\alpha_3 & \alpha_2 & \alpha_1
\end{bmatrix}. \tag{29}
$$

It can now be checked that $\Sigma_{xy}^*$ as shown in (29) can be written as

$$\Sigma_{xy}^* = H_x Z_{xy}^* H_y, \tag{29a}$$

where the matrices $H_x$ and $H_y$ are symmetric non-singular matrices of order
$p$ and $q$ respectively and are of the same form as shown in (8), and [...] $Z_{xy}^*$ is a
matrix of order $p \times q$ with elements $\delta_1, \delta_2, \ldots, \delta_q$ in its first row and the remaining
$(p-1)$ rows of it being all filled with zero. Using (29a) and replacing the term
x by y and conversely y by x in (14b), we can now express the changed
variance-covariance matrix for the X variables as:

$$
\begin{aligned}
C_{xx}^* &= \Sigma_{xx}^* - H_x Z_{xy}^* H_y(\Sigma_{yy}^{*-1} - \Sigma_{yy}^{*-1} C_{yy}^* \Sigma_{yy}^{*-1})H_y Z_{yx}^* H_x \\
&= H_x[\Psi_{xx} - Z_{xy}^* Q_{yy} Z_{yx}^*]H_x, \qquad (30)
\end{aligned}
$$

where we let $H_y(\Sigma_{yy}^{*-1} - \Sigma_{yy}^{*-1} C_{yy}^* \Sigma_{yy}^{*-1})H_y = Q_{yy}$, a symmetric matrix of
order $q$ with elements $\{q_{jk}\}$. It can now be verified that when any symmetric matrix
such as $Q_{yy}$ is pre-multiplied by a matrix like $Z_{xy}^*$, whose first row is filled
with non-zero elements and the rest of which is completely null, and the resultant
matrix is then post-multiplied by the transpose of the same matrix $Z_{xy}^*$, the product
will be essentially a diagonal matrix, say $D_x^*$, the element of which in the first
row and first column will be non-zero and all the other elements will be identically
zero. If this is true, then it is obvious that (30) reduces to

$$
\begin{aligned}
C_{xx}^* &= H_x[\Psi_{xx} - D_x^*]H_x \qquad (31) \\
&= H_x[\Delta_x + B_x \Gamma_x B_x' - D_x^*]H_x \\
&= H_x[\Delta_x^* + B_x \Gamma_x B_x']H_x, \qquad (31a)
\end{aligned}
$$

where $\Delta_x - D_x^* = \Delta_x^*$, a diagonal matrix of order $p$ with elements
$\delta_1 - d^*, \delta_2, \ldots, \delta_p$ in its main diagonal. Now it can be checked that if the
matrix $H_x$ as defined before is post-multiplied by the matrix $B_x$, the inverse of
the triangular matrix $T_x$, then the resultant will necessarily be $E_x$, a symmetric
matrix of order $p$ with units in the diagonal running from northeast to southwest
and zero elsewhere, as is defined in (7). We can therefore write (31a) as

$$C_{xx}^* = H_x \Delta_x^* H_x + E_x \Gamma_x E_x, \tag{32}$$

which essentially is of the form similar to (9). It is thus proved that when a set
of variables ordered as (25) obey the generating principle (7) and give rise to the
Guttman shrinking quasi-simplex covariance structure as shown in (9), selection
of subjects will have essentially no effect on the covariance structure of the
non-selection variables, provided subjects are selected on the basis of their scores on
the last $q$ variables, the variability of each of which is less than the variability of
the remaining variables.


6. DISCUSSION 


The proofs advanced in the above sections clearly show the invariance in
the structure of the Guttman quasi-simplex covariance matrices arising out of
both increasing and decreasing components models as presented in (1) and (7)
respectively, only when selection is based on a set of ordered variables arranged
from low to high in such a way that the variance of each of these is less than the
variance of any of the ordered variables constituting the non-selection battery.
The only effect of selection of the types mentioned above is to attenuate the
intercorrelations among the non-selection variables and to reduce the extent of
correlation between any variable of the non-selection set and another variable
of the selection set. Thus if selection is explicitly based on [...]
least to somewhat complex and the variance of any of these [...]
compared to the variances of the non-selection set, then such a selection [...]
the canonical correlations (Hotelling, 1935; [...] 1947)
[...] the selection and non-selection sets as shown in [...].

It has also been proved that when selection is based on [...] which
are more 'complex' than the non-selection variables in terms of [...]
rank on the Guttman quasi-simplex continuum, the invariance in the [...]
structure of the non-selection battery totally breaks down. Under [...]
condition, it can be demonstrated that the correlation matrix [...]
non-selection battery will not show after the selection the diagonal [...]
tendency since it completely fails to satisfy (6).

Therefore, when selection leads to a breakdown of the quasi-simplex structure, the statistical tests developed by Mukherjee (1965) on the likelihood ratio criterion are inadequate for testing certain multivariate hypotheses under the assumption that the population covariance matrix, $\Sigma$, is of the Guttman quasi-simplex pattern. These multivariate tests have been shown to be more powerful and less laborious than the conventional multivariate tests in which no assumptions are made regarding the structure of the population covariance matrix [...] of the model generating [...]. It is therefore recommended that selection be based on the variables with smaller variances when a set of variables is ordered in terms of increasing 'complexity'.

In order fully to appreciate the above recommendations, imagine an industrial set-up where the entire population of applicants is required to take a battery of mechanical tests, $x_1, x_2, \ldots, x_p$. Only persons having a particular multiple cut-off score or above it are appointed and they commence [...]. Criterion measures, $y_1, y_2, \ldots, y_q$, are obtained from the selected workers. Suppose that the variables from X [...] at different distances on the quasi-simplex [...] 'complex'. To the extent [...]

The results reported [...] on the quasi-simplex [...] explicit selection [...] alter the structure of the intercorrelation matrix [...]. The covariance matrix corresponding to the matrix of intercorrelations among the non-selection variables still shows essentially the Guttman quasi-simplex structure and can be expressed in the same way
after selection as before. The only effect of selection is to attenuate the intercorrelations among the non-selection (criterion) variables and between Y variables and any other variable from the X battery. This, however, will have no bearing upon the application of various statistical techniques developed particularly for the quasi-simplex (Mukherjee, 1965, 1966) or simplex data (Gabriel, 1962). If a multivariate analysis of variance (MANOVA) of such types of criterion measures obtained from several industrial groups is attempted, it could be more conveniently done under the assumption of a population covariance structure which is either of the form shown in (3) or in (9). As is shown in the Appendix, the computation of multiple correlations, partial correlations, canonical correlation as well as the test of the mean vector, turns out to be relatively simpler for quasi-simplex data.

But when explicit selection is based on the tests which are more variable than the measures treated as criterion, the effect of incidental selection on the criterion set is much more damaging in the sense that the whole structure of the correlation matrix for the non-selection set takes a different form and no longer shows the Guttman quasi-simplex pattern. As a result of this type of incidental selection, the predicted intercorrelation matrix obtained for the non-selection (criterion) set following either (24) or (28) ceases to show certain interesting implications for prediction problems as is true in the case of the inverse of a simplex correlation matrix or even a quasi-simplex correlation matrix such as the one obtained following (22). If the correlation matrix has a quasi-simplex pattern, then it can be mathematically proved that the matrix of multiple regression weights for predicting each variable from the remaining variables of the correlation matrix should show larger elements being concentrated near the main diagonal, and as the elements move farther from the diagonal, both vertically and horizontally, they should tend to vanish. In addition, the off-diagonal elements of the inverse of the correlation matrix for a set of quasi-simplex variables should be all necessarily negative. This implies that the non-neighbouring variables do not play as important a role as do the neighbouring ones in terms of their efficiency for predicting a particular variable of the set.
When variables ordered on the higher side of the 'complexity' continuum constitute the basis of selection, then even without applying the likelihood ratio test developed by Mukherjee (1966) one can confidently say that the predicted covariance matrix for the non-selection variables following (24) or (27) will fail to show the Guttman quasi-simplex structure after selection. Application of some of the multivariate tests developed specifically for a set of variables which obey the Guttman linear quasi-simplex model (1) will be highly inappropriate under such a condition.
On the basis of what has been said above, it is concluded that when selection is based on certain tests from amongst the set of variables that are ordered on the quasi-simplex continuum we should choose those variables which have relatively smaller variances. In that event the predicted intercorrelation matrix for the non-selection variables will still retain the Guttman quasi-simplex pattern and, as such, all the properties useful for prediction and for development of more convenient multivariate tests which are associated with this pattern can be fully exploited even if a set of such criterion variables has been subjected to incidental selection.

When the selection battery is composed of tests having variances larger than 
each of the tests in the non-selection battery, and both the selection and non- 
selection sets are ordered on the quasi-simplex continuum from complex to 
simple, the results clearly demonstrate the breakdown of the invariance. But 
there exists one possibility of avoiding the breakdown of the invariance discussed 
above. This involves transformation of the unit of measurements of each of the 
tests in the selection battery by such a constant that the variances of the trans- 
formed variates turn out to be smaller than the variances of each of the non- 
selection measures. However, in such a case of experimental manipulation of 
the test-statistics (Gulliksen, 1950) for the selection set, we will have to eliminate 
that selection variable which originally had the largest variance in the whole set 
of tests. After transformation of the measuring unit by a constant through 
appropriate manipulation of test length and/or scoring procedure, all other 
selection variables will have a simplex rank lower than each of the ordered non-selection variables on the 'complexity' dimension, and since it can be shown that the pattern of quasi-simplex correlation matrix is also invariant when the unit of measurement of all the variables is changed by a particular constant (multiplication or division of a particular ordered set by a scalar number), the recommendation submitted in this paper can be applied to any situation whenever a set of simplex variables each having different error of measurement generates a variance-covariance matrix of the form shown either in (3) or (9) and any of the sequentially ordered subsets from such a battery constitutes the basis of selection.
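The invariance of the correlation pattern under a change of measuring unit, on which the above suggestion rests, can be illustrated numerically. The following Python sketch is only an illustration: the covariance matrix and the scaling constants are hypothetical values chosen here, not data from this paper. It rescales two tests so that their variances fall below that of the third and confirms that the intercorrelations are unchanged.

```python
import math

def correlation(cov):
    """Convert a covariance matrix (list of lists) into a correlation matrix."""
    sd = [math.sqrt(cov[i][i]) for i in range(len(cov))]
    return [[cov[i][j] / (sd[i] * sd[j]) for j in range(len(cov))]
            for i in range(len(cov))]

def rescale(cov, factors):
    """Covariance matrix obtained after multiplying variable i by factors[i]."""
    n = len(cov)
    return [[factors[i] * factors[j] * cov[i][j] for j in range(n)]
            for i in range(n)]

# Hypothetical covariance matrix of three ordered tests (assumed values).
sigma = [[1.5, 1.0, 1.0],
         [1.0, 2.5, 2.0],
         [1.0, 2.0, 3.5]]

scaled = rescale(sigma, [0.2, 0.2, 1.0])   # shrink the unit of the first two tests
before = correlation(sigma)
after = correlation(scaled)
max_diff = max(abs(before[i][j] - after[i][j])
               for i in range(3) for j in range(3))
print(max_diff)
```

The variances of the rescaled tests change, but every correlation is reproduced up to rounding error, which is all that the argument in the text requires.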


APPENDIX 
Computational Ease in Calculation of Multiple Correlations

When a set of variables obeys the Guttman quasi-simplex linear model of the form of either (1) or (7), the calculation of the squared multiple correlation turns out to be quite simple. Assuming that the entire population variance-covariance matrix, $\Sigma$, of the form shown in (3) is known, that the jth diagonal term of $\Sigma$ is denoted by $\sigma_{jj}$, and that the symbol $\Sigma_{jj}$ stands for the cofactor of $\sigma_{jj}$ in $\Sigma$, it can be shown following Anderson (1958, p. 32) that the squared multiple correlation of the jth variable with the remaining $p+q-1$ variates, $x_1, \ldots, x_p, y_1, \ldots, y_q$, is

\[ \rho^2_{j\cdot c_j} = 1 - \frac{|\Sigma|}{\sigma_{jj}\,|\Sigma_{jj}|}, \tag{33} \]

where $c_j$ denotes the subscripts $1, \ldots, (j-1), (j+1), \ldots, (p+q)$. As indicated in (11), the determinant of the $\Sigma$ matrix can be easily evaluated as soon as the population variance-covariance matrix, $\Sigma$, is uniquely factored in terms of the product
of a triangular matrix and its transpose using (10c) and the values of the parameters, $\xi_1, \xi_2, \ldots, \xi_p$, are determined. In order to evaluate the determinant, $|\Sigma_{jj}|$, for all $j = 1, 2, \ldots, p$, the same type of transformation is needed. It is obvious from (10d) that

\[ |\Sigma_{jj}| = |B\,\Sigma_{jj}\,B'| \tag{34} \]

is equal to the product of all $\xi_j$ $(j = 1, 2, \ldots, p-1)$ only when the last variable is the criterion variable for the multiple regression problem. From this, it turns out that the squared multiple correlation (SMC) between the last variable and the remaining $p-1$ variables will be given by the following formula only when the variables are ordered on the quasi-simplex continuum:

\[ \rho^2_{p\cdot(1,2,\ldots,p-1)} = 1 - \frac{|\Sigma|}{\sigma_{pp}\,|\Sigma_{pp}|} = 1 - \frac{\xi_1\xi_2\xi_3\cdots\xi_p}{\sigma_{pp}\,(\xi_1\xi_2\xi_3\cdots\xi_{p-1})} = 1 - (\xi_p/\sigma_{pp}). \tag{35} \]
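The determinantal expression for the squared multiple correlation cited above from Anderson (1958), one minus the ratio of $|\Sigma|$ to $\sigma_{jj}|\Sigma_{jj}|$, can be checked on a small example. The sketch below is illustrative only; the 3 x 3 covariance matrix is a hypothetical one assumed here, and the helper names are not taken from the paper. The last variable is treated as the criterion and the result is compared with the direct regression form $\sigma_{yx}\Sigma_{xx}^{-1}\sigma_{xy}/\sigma_{yy}$.

```python
def det(m):
    """Determinant by Laplace expansion along the first row (small matrices only)."""
    if len(m) == 1:
        return m[0][0]
    total = 0.0
    for k in range(len(m)):
        minor = [row[:k] + row[k + 1:] for row in m[1:]]
        total += (-1) ** k * m[0][k] * det(minor)
    return total

def principal_minor(m, j):
    """Matrix left after deleting row j and column j (0-based index)."""
    return [row[:j] + row[j + 1:] for i, row in enumerate(m) if i != j]

def smc(sigma, j):
    """Squared multiple correlation of variable j with all remaining variables."""
    return 1.0 - det(sigma) / (sigma[j][j] * det(principal_minor(sigma, j)))

# Hypothetical covariance matrix (assumed values, not from the paper).
sigma = [[1.5, 1.0, 1.0],
         [1.0, 2.5, 2.0],
         [1.0, 2.0, 3.5]]

smc_last = smc(sigma, 2)

# Direct check for the last variable: quadratic form in the inverse of the
# 2 x 2 predictor block, divided by the criterion variance.
a, b, c = sigma[0][0], sigma[0][1], sigma[1][1]
v0, v1 = sigma[0][2], sigma[1][2]
direct = (c * v0 * v0 - 2 * b * v0 * v1 + a * v1 * v1) / (a * c - b * b) / sigma[2][2]
print(smc_last, direct)
```

The ratio $|\Sigma|/|\Sigma_{pp}|$ computed here is the residual variance of the last variable, which is the quantity that the factorization in the text delivers without a general determinant routine.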


For criterion variables other than the last one, the formula for the determination 
of SMCs is not so simple. Nevertheless, they can be easily calculated by the 
following recursive formulae for a predictor set composed of four tests. 


[Equations (36a) to (36d), the recursive expressions for the squared multiple correlations of the first four variables, beginning with $\rho^2_{1\cdot(2,3,4,5)}$ and ending with $\rho^2_{4\cdot(1,2,3,5)}$, are illegible in the source.]

Each of the above formulae can be easily generalized for any number of predictors as long as they are ordered on the quasi-simplex continuum. The accuracy of these formulae can be easily checked by a triangular factorization of the matrix of the jth principal minor $(j = 1, 2, \ldots, p)$ in the form suggested earlier in (10c). For example, in a six-variate regression problem, the evaluation of $|\Sigma_{44}|$ will be necessary for calculating the multiple correlation between variable 4 and the remaining five predictors and for this purpose, we need to express the minor as


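[Equation (37): the minor written out as a 5 x 5 tridiagonal matrix; its entries are illegible in the source.]

which can be written as

[Equation (38): the same matrix expressed as the product of a lower triangular matrix and its transpose.]

where

[Equation (39): the 5 x 5 lower triangular factor, whose non-zero elements lie on the diagonal and the first sub-diagonal; its entries are illegible in the source.]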
Using the above method of factorization, the jth principal minor of any order can be evaluated without [...] SMCs for any number of [...] quasi-simplex linear model [...]. Once these values have been determined, the matrix of multiple regression weights, $\boldsymbol{\beta}$, of order $p \times p$ for predicting each variable from the remaining $p-1$ variables can be determined from the following formula:

\[ \boldsymbol{\beta} = D P^{-1}, \tag{40} \]

where $P$ is the matrix of population intercorrelations of order $p$ and $D$ is a diagonal matrix of order $p \times p$ with elements

\[ D = \mathrm{Diag}\,(1-\rho^2_{1\cdot c_1},\; 1-\rho^2_{2\cdot c_2},\; \ldots,\; 1-\rho^2_{p\cdot c_p}), \]

where $c_j$ denotes all the variables of the intercorrelation matrix except the jth one, as defined in (33). The matrix of regression weights has the following explicit form in a four-variable case

\[ \boldsymbol{\beta} = \begin{pmatrix} 1 & -\beta_{12} & -\beta_{13} & -\beta_{14} \\ -\beta_{21} & 1 & -\beta_{23} & -\beta_{24} \\ -\beta_{31} & -\beta_{32} & 1 & -\beta_{34} \\ -\beta_{41} & -\beta_{42} & -\beta_{43} & 1 \end{pmatrix} \tag{41} \]
and deleting the diagonal element in (41), the jth column vector ($p-1 \times 1$) of standard partial regression weights can be expressed as

\[ \beta_j = P_j^{-1} V_j, \tag{42} \]

where $P_j$ is the predictor variable intercorrelation matrix of order $p-1$ when the jth variable acts as the criterion and $V_j$ is the column vector ($p-1 \times 1$) of correlations between variable j and all the remaining $p-1$ variables. The inverse of the population correlation matrix, $P$, showing the quasi-simplex structure can be easily obtained using the factorization mentioned in (9c). Thus the calculation of the regression weights for as many as six or seven predictor variables following either formula (41) or (42) does not appear to be so prohibitive as is true for the general case when no assumptions are made as to the ordering of the variables.
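The two routes to the standard partial regression weights described above, via the inverse of the full correlation matrix and via $P_j^{-1}V_j$, can be compared numerically. The Python sketch below is an illustration under assumptions: the 3 x 3 correlation matrix is hypothetical, and a plain cofactor inverse is used instead of the quasi-simplex factorization, which is not reproduced here.

```python
def det(m):
    """Determinant by Laplace expansion along the first row (small matrices only)."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** k * m[0][k] * det([r[:k] + r[k + 1:] for r in m[1:]])
               for k in range(len(m)))

def inverse(m):
    """Inverse through the adjugate (transposed cofactor matrix)."""
    n, d = len(m), det(m)
    cof = [[(-1) ** (i + j) *
            det([r[:j] + r[j + 1:] for k, r in enumerate(m) if k != i])
            for j in range(n)] for i in range(n)]
    return [[cof[j][i] / d for j in range(n)] for i in range(n)]

def weights_from_inverse(p, j):
    """Weights for criterion j read off row j of the inverse correlation matrix."""
    inv = inverse(p)
    return [-inv[j][k] / inv[j][j] for k in range(len(p)) if k != j]

def weights_direct(p, j):
    """Weights for criterion j computed as inverse(P_j) times V_j."""
    others = [k for k in range(len(p)) if k != j]
    inv = inverse([[p[a][b] for b in others] for a in others])
    v = [p[k][j] for k in others]
    return [sum(inv[r][c] * v[c] for c in range(len(v))) for r in range(len(v))]

# Hypothetical correlation matrix (assumed values).
p = [[1.0, 0.6, 0.4],
     [0.6, 1.0, 0.5],
     [0.4, 0.5, 1.0]]

via_inverse = weights_from_inverse(p, 2)
via_subset = weights_direct(p, 2)
print(via_inverse, via_subset)
```

A single inversion of $P$ yields the weights for every criterion at once, which is why the text stresses that the inverse is cheap to obtain when $P$ has the quasi-simplex structure.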


Computational Short-cut in Canonical Correlation Problem

For showing how computational labour in the calculation of population canonical correlation between the predictor set, X, and the criterion set, Y, can be saved when the population covariance matrix, $\Sigma$, has a Guttman quasi-simplex covariance structure of the form of either (3) or (9), we consider again the matrix $\Sigma$ as partitioned in (12). In the general case, the problem of canonical correlation as developed by Hotelling (1935) is reduced to finding the q roots, $\lambda_j$ ($j = 1, 2, \ldots, q$ where $q < p$), which are the squared canonical correlations in the determinantal equation

\[ (\Sigma_{yx}\Sigma_{xx}^{-1}\Sigma_{xy} - \lambda_j\Sigma_{yy})\,a_j = 0, \tag{43} \]

where $a_j$ is the column vector of criterion variable weights associated with $\lambda_j$. The equation has a non-trivial solution when

\[ |\Sigma_{yx}\Sigma_{xx}^{-1}\Sigma_{xy} - \lambda\Sigma_{yy}| = 0. \tag{44} \]
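The roots of (44) are the eigenvalues of $\Sigma_{yy}^{-1}\Sigma_{yx}\Sigma_{xx}^{-1}\Sigma_{xy}$. As a hedged illustration, the Python sketch below solves the two-by-two case in closed form. The covariance values are hypothetical: they are built here so that the covariance of any two different variables equals the true-score variance of the less complex one, which is an assumption about the form of (3) rather than a quotation of it. With such a structure the cross-covariance block has rank one, so only one root differs from zero.

```python
import math

def inv2(m):
    """Inverse of a 2 x 2 matrix."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def mul2(a, b):
    """Product of two 2 x 2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def squared_canonical_correlations(s_xx, s_xy, s_yy):
    """Roots of |S_yx S_xx^-1 S_xy - lambda S_yy| = 0, largest first."""
    s_yx = [[s_xy[j][i] for j in range(2)] for i in range(2)]
    m = mul2(inv2(s_yy), mul2(s_yx, mul2(inv2(s_xx), s_xy)))
    trace = m[0][0] + m[1][1]
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    disc = math.sqrt(max(trace * trace - 4.0 * det, 0.0))
    return [(trace + disc) / 2.0, (trace - disc) / 2.0]

# Hypothetical blocks: true-score variances 1, 2 (X tests) and 3, 4 (Y tests),
# with an error variance of 0.5 added to every diagonal element.
s_xx = [[1.5, 1.0], [1.0, 2.5]]
s_xy = [[1.0, 1.0], [2.0, 2.0]]
s_yy = [[3.5, 3.0], [3.0, 4.5]]

roots = squared_canonical_correlations(s_xx, s_xy, s_yy)
print(roots)
```

The vanishing second root is the numerical counterpart of the statement, derived below in the text, that the characteristic equation has a single root under the quasi-simplex structure.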


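In order to simplify (44), we once again express $\Sigma_{yx}$ as in (16) and use the identity (10) to obtain

\[ \Sigma_{xx}^{-1} = (T_x\Psi_{xx}T_x')^{-1} = B_x'\Psi_{xx}^{-1}B_x. \]

These two factorizations make it possible to write (44) as

\[ |T_yZ_{yx}T_x'B_x'\Psi_{xx}^{-1}B_xT_xZ_{xy}T_y' - \lambda\Sigma_{yy}| = 0, \tag{45} \]

where $T_x$ is a square lower triangular matrix of order p having all non-zero elements equal to unity and has its inverse which has already been referred to as $B_x$. The characteristic equation (45) therefore reduces to

\[ |T_y(Z_{yx}\Psi_{xx}^{-1}Z_{xy} - \lambda B_y\Sigma_{yy}B_y')T_y'| = 0. \tag{46} \]

Since $T_y$ is non-singular and $\Psi_{yy} = B_y\Sigma_{yy}B_y'$, we can also write (46) as

\[ |Z_{yx}\Psi_{xx}^{-1}Z_{xy} - \lambda\Psi_{yy}| = 0 \tag{47} \]

or

\[ |Z_{yx}\Psi_{xx}^{-1}Z_{xy}\Psi_{yy}^{-1} - \lambda I| = 0. \tag{48} \]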
Now recalling that $Z_{yx}$ is a matrix of order $q \times p$ whose first row has non-zero elements equal to $\delta_1, \delta_2, \ldots, \delta_p$, and whose remaining rows are completely null, it is obvious that if the matrix $\Psi_{xx}^{-1}$ is pre-multiplied by $Z_{yx}$ and then post-multiplied by $Z_{xy}$, it will always result in a diagonal matrix, say $D_y^{**}$, the explicit form of which will be

\[ Z_{yx}\Psi_{xx}^{-1}Z_{xy} = D_y^{**} = \begin{pmatrix} \sum_{j=1}^{p}\sum_{k=1}^{p}\delta_j\delta_k\psi^{jk} & 0 & \cdots & 0 \\ 0 & 0 & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & 0 \end{pmatrix}. \tag{49} \]

It therefore turns out that the characteristic equation (48) can be written as

\[ |D_y^{**}\Psi_{yy}^{-1} - \lambda I| = 0. \tag{50} \]

Since the first diagonal element of $D_y^{**}$ is non-zero, and the remaining elements are null as has been shown in (49), it follows immediately that the characteristic equation (50) will have one single non-zero root and this will be precisely

\[ \lambda = \Big(\sum_{j=1}^{p}\sum_{k=1}^{p}\delta_j\delta_k\psi^{jk}\Big)(\psi^{11}) = \psi\eta. \tag{51} \]

[...] between the criterion set and the predictor set [...]. Let the post-selection variance-covariance matrix for the quasi-simplex variables of the form (1) be represented by

\[ C = \begin{pmatrix} C_{xx} & C_{xy} \\ C_{yx} & C_{yy} \end{pmatrix}, \]

where the submatrices $C_{xy}$ and $C_{yy}$ are obtained following (14a) and (20) respectively. The canonical correlation, say $\lambda^*$, between the X battery and the Y battery will be given by the solution of the following determinantal equation:

\[ |C_{yx}C_{xx}^{-1}C_{xy} - \lambda^*C_{yy}| = 0. \tag{52} \]
Now expanding $C_{xy}$ in terms of (14a) and similarly its transpose $C_{yx}$, eqn. (52) can be written as

\[ |\Sigma_{yx}\Sigma_{xx}^{-1}C_{xx}C_{xx}^{-1}C_{xx}\Sigma_{xx}^{-1}\Sigma_{xy} - \lambda^*C_{yy}| = 0 \]

or

\[ |\Sigma_{yx}\Sigma_{xx}^{-1}C_{xx}\Sigma_{xx}^{-1}\Sigma_{xy} - \lambda^*C_{yy}| = 0. \tag{53} \]

If the submatrices $\Sigma_{yx}$ and $\Sigma_{xx}$ are now substituted for the expressions stated in (16) and (15b) respectively, the first part of the left-hand side of (53) can be then written as

\[ T_yZ_{yx}T_x'B_x'\Psi_{xx}^{-1}B_xC_{xx}B_x'\Psi_{xx}^{-1}B_xT_xZ_{xy}T_y' \]

which, by virtue of the relation $T_x'B_x' = I_p$, reduces to

\[ T_yZ_{yx}\Psi_{xx}^{-1}B_xC_{xx}B_x'\Psi_{xx}^{-1}Z_{xy}T_y' \]

or the following still more simplified form if we let $\Psi_{xx}^{-1}B_xC_{xx}B_x'\Psi_{xx}^{-1} = M_x$, a symmetric matrix of $p \times p$ order as defined in (19):

\[ T_yZ_{yx}M_xZ_{xy}T_y'. \tag{54} \]

Using (54) for the first term of (53), the determinantal equation (53) can be written as

\[ |T_yZ_{yx}M_xZ_{xy}T_y' - \lambda^*C_{yy}| = 0. \tag{55} \]

Expressing $C_{yy}$ as in (20), (55) turns out to be

\[ |T_yZ_{yx}M_xZ_{xy}T_y' - \lambda^*T_y(\Psi_{yy} - D_y)T_y'| = 0 \]

or

\[ |T_y(Z_{yx}M_xZ_{xy} - \lambda^*(\Psi_{yy} - D_y))T_y'| = 0. \tag{56} \]

Pre- and post-multiplication of the above terms within the determinant by $B_y$ and $B_y'$ respectively yield

\[ |Z_{yx}M_xZ_{xy} - \lambda^*(\Psi_{yy} - D_y)| = 0. \tag{57} \]

Now recalling that $Z_{yx}$ is a matrix which has its p non-zero elements in the first row and all the other rows of it are null, it is obvious that any symmetric matrix such as $M_x$ when pre-multiplied by $Z_{yx}$ and then post-multiplied by $Z_{xy}$ will produce a matrix of order $q \times q$ which is essentially diagonal in nature. Such a diagonal matrix, say $D_m$, will have all its elements equal to zero excepting the first one, as in the case of $D_y$. Thus it is seen that (57) reduces to

\[ |D_m - \lambda^*(\Psi_{yy} - D_y)| = 0. \tag{58} \]

Letting $(D_m + \lambda^*D_y) = D_n^*$, a diagonal matrix having all its elements equal to zero except the first one, which could be denoted as $d_n^*$, (58) further reduces to

\[ |D_n^* - \lambda^*\Psi_{yy}| = 0 \]

or

\[ |D_n^*\Psi_{yy}^{-1} - \lambda^*I| = 0. \tag{59} \]
The determinantal equation (59) is of the same form as (50) and as such the only latent root of it will be given by

\[ \lambda^* = \psi d_n^* = \psi(m + \lambda^*\omega), \tag{60} \]

where

\[ \omega = \sum_{j}\sum_{k}\delta_j\delta_k(\psi^{jk} - m_{jk}) \quad\text{and}\quad m = \sum_{j}\sum_{k}\delta_j\delta_k m_{jk}, \]

$m_{jk}$ being the element of the jth column and kth row of the symmetric matrix $M_x = \Psi_{xx}^{-1}B_xC_{xx}B_x'\Psi_{xx}^{-1}$. From this it readily follows that

\[ m = \eta - \omega, \tag{61} \]

where $\eta$ is as given in (51). Putting the value of m as found in (61) and expanding (60), one gets the desired solution

\[ \lambda^* = \psi\eta - \psi\omega + \psi\omega\lambda^* = \psi\eta - \psi\omega(1 - \lambda^*). \tag{62} \]

Now recall that the squared canonical correlation between X and Y before selection was equal to $\psi\eta$ as shown in (51). Thus it is proved that

\[ \lambda^* = \lambda - \psi\omega(1 - \lambda^*). \tag{63} \]

Since $\lambda^*$ is a squared correlation coefficient, and so cannot exceed unity, and $\psi\omega$ is a positive number, it readily follows from (63) that

\[ \lambda^* < \lambda \quad\text{or}\quad \sqrt{\lambda^*} < \sqrt{\lambda}, \tag{64} \]

where the left-hand term stands for the canonical correlation after selection and the right-hand term for the canonical correlation in the population of individuals between the X and Y batteries before selection.
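The attenuation of the canonical correlation by selection can be illustrated in the simplest case of one selection variable and one criterion, where the squared canonical correlation is the ordinary squared correlation. The Python sketch below is a hedged illustration: it assumes that the post-selection covariances follow the standard Pearson-Lawley selection formulae (taken here to be what the references to (14a) and (20) denote), and all numerical values are hypothetical.

```python
def squared_correlation(s_xx, s_xy, s_yy):
    """Squared correlation, i.e. the single root of (44) when p = q = 1."""
    return s_xy ** 2 / (s_xx * s_yy)

def after_selection(s_xx, s_xy, s_yy, c_xx):
    """Post-selection (co)variances under the assumed Pearson-Lawley formulae."""
    c_xy = c_xx * s_xy / s_xx
    c_yy = s_yy - s_xy ** 2 * (s_xx - c_xx) / s_xx ** 2
    return c_xx, c_xy, c_yy

# Hypothetical population values; selection cuts the variance of x from 4 to 1.
s_xx, s_xy, s_yy, c_xx = 4.0, 2.0, 2.0, 1.0

lam = squared_correlation(s_xx, s_xy, s_yy)
lam_star = squared_correlation(*after_selection(s_xx, s_xy, s_yy, c_xx))

# Scalar versions of the quantities in (51), (60) and (61).
psi = 1.0 / s_yy                      # inverse of the criterion variance
eta = s_xy ** 2 / s_xx                # delta^2 times the inverse of s_xx
m = s_xy ** 2 * c_xx / s_xx ** 2      # delta^2 times the scalar M_x
omega = eta - m
rhs_63 = lam - psi * omega * (1.0 - lam_star)
print(lam, lam_star, rhs_63)
```

In this example the squared correlation falls from 0.5 to 0.2, and the right-hand side of (63) reproduces the post-selection value.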


Tests on Mean Vectors with Guttman Quasi-Simplex Covariance Matrix

If a random sample of N observations each in p dimensions, i.e. $x_1, x_2, \ldots, x_N$, has been drawn from the multivariate normal population with mean vector $\mu$ and known non-singular covariance matrix $\Sigma$ of the form (3), the hypothesis that the mean vector parameter $\mu$ has a specified value, $\mu_0$, as shown below

\[ H_0: \mu = \mu_0 \]

can be tested by the statistic

\[ \chi^2 = N(\bar{x} - \mu_0)'\Sigma^{-1}(\bar{x} - \mu_0), \tag{65} \]

which under the null hypothesis has the chi-square distribution with p degrees of freedom (Morrison, 1967, p. 129). The null hypothesis, $H_0$, is accepted at the $\alpha$ level of significance if

\[ \chi^2 \le \chi^2_{\alpha;\,p}, \]

and it is rejected if the statistic (65) exceeds $\chi^2_{\alpha;\,p}$.
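The statistic (65) can also be obtained without inverting $\Sigma$, as the ratio of $|\Sigma + N(\bar{x}-\mu_0)(\bar{x}-\mu_0)'|$ to $|\Sigma|$, less one, which is the form the derivation that follows arrives at. The Python sketch below compares the two computations for a bivariate case. It is an illustration only; the covariance matrix, the sample size and the mean differences are hypothetical values assumed here.

```python
def det2(m):
    """Determinant of a 2 x 2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def chi2_quadratic_form(sigma, diff, n):
    """N (xbar - mu0)' Sigma^-1 (xbar - mu0), using an explicit 2 x 2 inverse."""
    d = det2(sigma)
    inv = [[sigma[1][1] / d, -sigma[0][1] / d],
           [-sigma[1][0] / d, sigma[0][0] / d]]
    return n * sum(diff[i] * inv[i][j] * diff[j]
                   for i in range(2) for j in range(2))

def chi2_determinant_ratio(sigma, diff, n):
    """|Sigma + N d d'| / |Sigma| - 1, which needs no matrix inverse."""
    bumped = [[sigma[i][j] + n * diff[i] * diff[j] for j in range(2)]
              for i in range(2)]
    return det2(bumped) / det2(sigma) - 1.0

# Hypothetical inputs: known covariance matrix, N = 10, xbar - mu0 = (0.3, -0.2).
sigma = [[2.0, 1.0], [1.0, 3.0]]
diff = [0.3, -0.2]

by_inverse = chi2_quadratic_form(sigma, diff, 10)
by_determinants = chi2_determinant_ratio(sigma, diff, 10)
print(by_inverse, by_determinants)
```

Both routes give 0.94 here. The determinant form matters for quasi-simplex data because both determinants follow directly from the triangular factorization.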
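It is possible to avoid inverting the $\Sigma$ matrix by expressing $\chi^2$ of (65) as the quotient of two determinants. For this purpose, it may be convenient to form the $(p+1) \times (p+1)$ matrix of the following form

\[ F = \begin{pmatrix} \Sigma & N^{1/2}(\bar{x} - \mu_0) \\ N^{1/2}(\bar{x} - \mu_0)' & -1 \end{pmatrix} \]

and expand its determinant as

\[ |F| = -|\Sigma + N(\bar{x} - \mu_0)(\bar{x} - \mu_0)'|. \tag{66} \]

Since $\Sigma$ is non-singular, we can also write

\[ |F| = |\Sigma| \cdot |-1 - N(\bar{x} - \mu_0)'\Sigma^{-1}(\bar{x} - \mu_0)|. \tag{67} \]

Since the second term is a scalar number, we can write (67) also as

\[ |F| = -|\Sigma|\,[1 + N(\bar{x} - \mu_0)'\Sigma^{-1}(\bar{x} - \mu_0)] \]

or

\[ |F| = -|\Sigma|\,[1 + \chi^2] \]

or

\[ 1 + \chi^2 = -\frac{|F|}{|\Sigma|} \]

or

\[ \chi^2 = -\frac{|F|}{|\Sigma|} - 1. \tag{68} \]

Now substituting in (68) the value of $|F|$ as derived in (66), we obtain

\[ \chi^2 = \frac{|\Sigma + N(\bar{x} - \mu_0)(\bar{x} - \mu_0)'|}{|\Sigma|} - 1. \tag{69} \]

Letting $N^{1/2}(\bar{x} - \mu_0) = \phi$, a simplified expression for (69) can be given as below:

\[ \chi^2 = \frac{|\Sigma + \phi\phi'|}{|\Sigma|} - 1. \tag{70} \]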
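Now recalling that $B$ is a non-singular lower triangular matrix with unity in the diagonal, it is seen that an alternative expression for (70) will be given by

\[ \chi^2 = \frac{|B\Sigma B' + B\phi\phi'B'|}{|B\Sigma B'|} - 1. \tag{71} \]

Using (10c), where the triangular matrix $\Gamma$ is defined as in (10d) and letting $B\phi = \varepsilon$, eqn. (71) can be written as

\[ \chi^2 = \frac{|\Gamma\Gamma' + \varepsilon\varepsilon'|}{|\Gamma\Gamma'|} - 1 = \frac{|\Gamma|\,|I + \Gamma^{-1}\varepsilon\varepsilon'\Gamma'^{-1}|\,|\Gamma'|}{|\Gamma|\,|\Gamma'|} - 1 = \frac{|\Gamma|\,|I + \kappa\kappa'|\,|\Gamma'|}{|\Gamma|\,|\Gamma'|} - 1, \tag{72} \]

where $\kappa = \Gamma^{-1}\varepsilon$. From (72) it is readily seen that the expression of chi-square in (71) reduces to

\[ \chi^2 = |I + \kappa\kappa'| - 1. \tag{73} \]

Now it can be checked that the determinant

\[ |I + \kappa\kappa'| = -\begin{vmatrix} I & \kappa \\ \kappa' & -1 \end{vmatrix} = -(1)(-1 - \kappa'\kappa) = 1 + \kappa'\kappa, \tag{74} \]

where $\kappa'$ is a $1 \times p$ vector with elements $\kappa_1, \kappa_2, \ldots, \kappa_p$ as defined earlier. Therefore, eqn. (73) finally reduces to

\[ \chi^2 = \sum_{j=1}^{p}\kappa_j^2 \qquad (j = 1, 2, \ldots, p) \tag{75} \]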

and the explicit expression for $\kappa_j$ can be written as

[Equation (76): the explicit expressions for the first few $\kappa_j$ and for the general term are illegible in the source, as are the accompanying definitions of the quantities entering them for $j = 1, 2, \ldots, p$ and for $j = 2, 3, \ldots, p$.]

It is clear from the foregoing derivation that the statistical test of the mean vector under the assumption of a Guttman quasi-simplex covariance matrix reduces to the calculation of chi-square shown in (75) by obtaining the sum of squares of the p values of $\kappa_j$ according to the general expression appearing in (76).
Computational Ease in Partial Correlation Problems

Letting

\[ D_{yy\cdot x} = \mathrm{Diag}\,(\Sigma_{yy} - \Sigma_{yx}\Sigma_{xx}^{-1}\Sigma_{xy}) = D, \tag{77} \]

it is easy to derive from the partitioning (12) the matrix of partial correlations between the q variables of criterion set Y holding all the variables of predictor set X constant. The covariance matrix of the conditional distribution, i.e. the covariance matrix of the Y set given the X set, can be written as (Anderson, 1958, p. 80)

\[ \Sigma_{yy\cdot x} = \Sigma_{yy} - \Sigma_{yx}\Sigma_{xx}^{-1}\Sigma_{xy}. \tag{78} \]

Therefore the matrix of partial correlations in the population can be written as

\[ P_{yy\cdot x} = D^{-1/2}(\Sigma_{yy} - \Sigma_{yx}\Sigma_{xx}^{-1}\Sigma_{xy})D^{-1/2}, \tag{79} \]

where the $-\tfrac{1}{2}$ power of $D$ indicates that the diagonal matrix as involved in (79) consists of the square roots of the reciprocals of the elements of $D$.
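Equations (78) and (79) can be followed step by step in a small case. The Python sketch below is a hedged illustration with one predictor and two criteria; the covariances are hypothetical and are chosen so that the predictor has the same covariance with both criteria, as would be the case for a less complex variable under the assumed quasi-simplex pattern. The partial correlation is checked against the familiar first-order formula.

```python
import math

def partial_covariance(s_yy, s_yx, s_xx):
    """Sigma_yy.x = Sigma_yy - Sigma_yx Sigma_xx^-1 Sigma_xy for a single x."""
    q = len(s_yy)
    return [[s_yy[i][j] - s_yx[i] * s_yx[j] / s_xx for j in range(q)]
            for i in range(q)]

def to_correlation(cov):
    """Scale a covariance matrix by D^(-1/2) on both sides."""
    sd = [math.sqrt(cov[i][i]) for i in range(len(cov))]
    return [[cov[i][j] / (sd[i] * sd[j]) for j in range(len(cov))]
            for i in range(len(cov))]

# Hypothetical values: var(x) = 1.5, cov(y1, x) = cov(y2, x) = 1.
s_xx = 1.5
s_yx = [1.0, 1.0]
s_yy = [[2.5, 2.0], [2.0, 3.5]]

cond = partial_covariance(s_yy, s_yx, s_xx)
partial_r = to_correlation(cond)[0][1]

# Classical first-order partial correlation for comparison.
r12 = s_yy[0][1] / math.sqrt(s_yy[0][0] * s_yy[1][1])
r1x = s_yx[0] / math.sqrt(s_yy[0][0] * s_xx)
r2x = s_yx[1] / math.sqrt(s_yy[1][1] * s_xx)
classical = (r12 - r1x * r2x) / math.sqrt((1 - r1x ** 2) * (1 - r2x ** 2))

reductions = [s_yy[i][j] - cond[i][j] for i in range(2) for j in range(2)]
print(partial_r, classical, reductions)
```

Every element of the criterion covariance matrix is reduced by the same amount (here 2/3), which is the sense in which partialling out the X variables lowers the variances and covariances of the Y variables by a constant.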

Now using the same method of simplification as was involved in (17), we can write (79) as

\[ P_{yy\cdot x} = D^{-1/2}T_y(\Psi_{yy} - Z_{yx}\Psi_{xx}^{-1}Z_{xy})T_y'D^{-1/2} = D^{-1/2}T_y(\Psi_{yy} - D_y^{**})T_y'D^{-1/2} = D^{-1/2}(\Sigma_{yy} - \eta\,\mathbf{1}\mathbf{1}')D^{-1/2}, \tag{80} \]

where $D_y^{**}$ is the same diagonal matrix the explicit form of which is shown in (49). The derivation shows that in partialling out the effects of all X variables, there occurs a reduction in the variances and covariances of the Y variables just by a constant.
LINEAR HYPOTHESES IN 2 x a FREQUENCY TABLES

By R. S. RODGER*
University of Sydney

An approximate method, with decision-based error-rates, for testing linear statistical hypotheses (${}_0\kappa_i$) about a independent binomial parameters ($\phi_j$) is given which is more general than the usual methods for partitioning $X_m^2$ from a 2 x a table. Parameters in Table 2 may be used to choose sample-size (n) for stated alternatives to ${}_0\kappa_i$ and power (or expected proportion of rejections). The procedure may be used for planned tests ($\nu_1 = 1$) or post-hoc tests ($\nu_1 = a-1$ or $\nu_1 = a$) and in the latter case the probability of rejecting r of the ${}_0\kappa_i$ ($r = 1, 2, \ldots, \nu_1$) can be directly calculated for any statement of what is true in the populations.

The results from sampling experiments are used to examine the adequacy of the approximation and compare the method proposed ($R\nu_1$) with two others ($G\nu_1$ and $Z\nu_1$). Although the latter methods have a power advantage under some circumstances, their experimental results for n 30 are generally less close to 'theory' than those of $R\nu_1$, and they are always arithmetically and conceptually more complex. The results from these experiments suggest that the $R\nu_1$ approximation is adequate when $n \ge 10$ and the row of smaller expected frequencies ($n\phi_.$) is greater than or equal to 1. When the $\phi_j$ differ and the true $\phi_j$ approach 0 or 1, 'end-effects' appear but these diminish as n increases.

These methods can be used in a 2 x a generalization of the 2 x 2 median test. If this is interpreted not as a test of population medians, but as a test of proportions of population distributions to the right of $M_s$ (the value of the common sample median), which is $\phi_j$ in population j, it is distribution-free.


1. INTRODUCTION

Observations in each of a populations are classified as either $K$ or $\bar{K}$. The proportion of observations in class $K$ in population j is $\phi_j$, so that $\phi_j$ is the binomial parameter of population j. Linear statistical hypotheses about $\phi_j$ ($j = 1, 2, \ldots, a$) may be tested either by a method for planned tests or by a method for post-hoc tests given here. The method for planned tests has an approximate error-rate of the first kind $\alpha$ and of the second kind $1 - \beta$. The proportion of null hypotheses (${}_0\kappa_i$) rejected by the method for post-hoc tests will be, approximately, $E\alpha$ when the ${}_0\kappa_i$ are true and $E\beta$ when specified alternatives are true. The sample size n required to discriminate alternatives to ${}_0\kappa_i$ for a given $\beta$ or $E\beta$ can be calculated by using a table given here. The closeness of the approximation is demonstrated by the results of sampling experiments and compared with that of two other procedures. The error of approximation is generally smaller for the post-hoc method proposed than for these other two procedures and, although the power of the proposed method for post-hoc tests ($R\nu_1$) under some conditions is less than that of one of the alternative procedures ($G\nu_1$), its [...] and its simplicity recommend it.

If observations are classified as $K$ or $\bar{K}$, depending upon whether [...] the median of all samples ($M_s$), $\phi_j$ is the proportion of the distribution to the right of $M_s$ in population j and tests on $\phi_j$ can be carried out as before. This form of generalization of the simple median test is distribution-free. The method proposed in this paper for testing contrasts among proportions in a two-rowed contingency table is very similar to one available for means in an analysis of variance context.

*1969 Killam Senior Fellow, Dalhousie University, Halifax, Nova Scotia.
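2. LINEAR HYPOTHESES AND ALTERNATIVES

Observations in a independent populations are each classified into one of two classes, $K$ or $\bar{K}$, yielding a 2 x a table. The proportion of observations in population j ($j = 1, 2, \ldots, a$) which are in class $K$ is $\phi_j$ and the proportion in class $\bar{K}$ is $1 - \phi_j$ ($0 < \phi_j < 1$). The ith linear statistical hypothesis about the $\phi_j$ is

\[ {}_0\kappa_i:\; c_{i1}\phi_1 + c_{i2}\phi_2 + \ldots + c_{ia}\phi_a + d_i = 0 \tag{1} \]

and an alternative to ${}_0\kappa_i$ is

\[ {}_1\kappa_i:\; c_{i1}\phi_1 + c_{i2}\phi_2 + \ldots + c_{ia}\phi_a + d_i = \delta_i. \tag{2} \]

Another alternative to ${}_0\kappa_i$ [...]. Generally the $d_i$ may be [...] there are further restrictions [...]. The value of $\delta_i$ is a real number [...] taken here to be [...]. Examples of hypotheses and alternatives are:

\[ {}_0\kappa_1:\; \phi_1 + \phi_2 - 1 = 0, \qquad {}_1\kappa_1:\; \phi_1 + \phi_2 - 1 = \pm[0.2\,\phi_.(1 - \phi_.)]^{1/2}; \]
\[ {}_0\kappa_2:\; \phi_1 - \phi_2 = 0, \qquad {}_1\kappa_2:\; \phi_1 - \phi_2 = \pm[0.2\,\phi_.(1 - \phi_.)]^{1/2}; \]
\[ {}_0\kappa_3:\; \phi_3 - 0.5 = 0, \qquad {}_1\kappa_3:\; \phi_3 - 0.5 = \pm[0.1\,\phi_.(1 - \phi_.)]^{1/2}. \]

The signs on these expressions are the signs assigned to $\delta_i$. The relevant constants for these hypotheses are

\[ c_{1j} = 1, 1, 0;\quad d_1 = -1;\quad u_1 = 2;\quad w_1 = 0;\quad \delta_1 = \pm[0.2\,\phi_.(1 - \phi_.)]^{1/2}, \]
\[ c_{2j} = 1, -1, 0;\quad d_2 = 0;\quad u_2 = 1;\quad w_2 = 1;\quad \delta_2 = \pm[0.2\,\phi_.(1 - \phi_.)]^{1/2}, \]
\[ c_{3j} = 0, 0, 1;\quad d_3 = -0.5;\quad u_3 = 1;\quad w_3 = 0;\quad \delta_3 = \pm[0.1\,\phi_.(1 - \phi_.)]^{1/2}. \]

When, for fixed i, $\sum_j c_{ij} = 0$, the hypothesis is a contrast and, if one $c_{ij} = 1$, another $c_{ij} = -1$ and all other $c_{ij} = 0$, the contrast is a comparison. Hypothesis ${}_0\kappa_2$ above is a comparison.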


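For n observations from each of the a populations, a quadratic form of ${}_0\kappa_i$ is

\[ {}_0\Lambda_i = n\Big(\sum_j c_{ij}\phi_j + d_i\Big)^2 \Big/ \sum_j c_{ij}^2 = 0, \tag{3} \]

and the same function of ${}_1\kappa_i$ is

\[ {}_1\Lambda_i = n\Big(\sum_j c_{ij}\phi_j + d_i\Big)^2 \Big/ \sum_j c_{ij}^2 = \Delta_i\,\phi_.(1 - \phi_.). \tag{4} \]

The quantities $\delta_i$ and $\Delta_i$ are non-centrality parameters, the relationship between which is

\[ \Delta_i = n\delta_i^2 \Big/ \Big[\phi_.(1 - \phi_.)\sum_j c_{ij}^2\Big]. \tag{5} \]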
If h hypotheses are chosen for a proportions ($\phi_j$), the $c_{ij}$ for these may be written as a matrix ${}_hC_a$. No matter what the value of h, the rank of ${}_hC_a$ cannot exceed a and, if contrasts are the only hypotheses considered, the rank cannot exceed $a - 1$. If h hypotheses, for which ${}_hC_a$ has rank less than h, are asserted to be true, the assertion contains either contradiction or redundancy. The set $(\phi_1 - \phi_2) = 0$, $(\phi_3 - \phi_4) = 0$, $(\phi_1 - \phi_2) - (\phi_3 - \phi_4) = 0$, for example, has $a = 4$ and [...]. The assertion that these three contrasts are true contains redundancy: the truth of any two implies the truth of the third. If their values are $\delta_1$, $\delta_2$ and $\delta_3$, there is redundancy when $\delta_3 = \delta_1 - \delta_2$ and contradiction otherwise. Assertions of statistical hypotheses, in a single scientific report, which involve contradiction are not uncommon, but they are clearly inadmissible. Methods exist for testing many more than a hypotheses, but assertions of truth must be restricted to h hypotheses ($h \le a$ generally and $h \le a - 1$ for contrasts only) for which ${}_hC_a$ has rank h, if contradiction and redundancy are to be avoided.
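Whether a chosen set of hypotheses is free of redundancy or contradiction can be checked by computing the rank of its coefficient matrix. The Python sketch below does this by Gaussian elimination. It is illustrative only: the three contrasts among four proportions used here, two comparisons and their difference, are assumed for the example.

```python
def rank(rows, tol=1e-12):
    """Rank of a matrix (list of rows) by Gaussian elimination with pivoting."""
    m = [[float(v) for v in row] for row in rows]
    rk = 0
    for col in range(len(m[0])):
        pivot = max(range(rk, len(m)), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < tol:
            continue                      # no usable pivot in this column
        m[rk], m[pivot] = m[pivot], m[rk]
        for r in range(rk + 1, len(m)):
            f = m[r][col] / m[rk][col]
            m[r] = [x - f * y for x, y in zip(m[r], m[rk])]
        rk += 1
        if rk == len(m):
            break
    return rk

# Assumed coefficient matrix: two comparisons and their difference (a = 4, h = 3).
contrasts = [[1, -1, 0, 0],
             [0, 0, 1, -1],
             [1, -1, -1, 1]]

full_rank = rank(contrasts)        # rank of all three contrasts together
pair_rank = rank(contrasts[:2])    # rank of the first two alone
print(full_rank, pair_rank)
```

A rank below the number of hypotheses signals that at least one of them is implied by, or conflicts with, the others, so that only a subset of full rank may be asserted together.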
Тһе interrelations among hypotheses may be described by the matrix 
\[ {}_hC_a\;{}_aC_h^T \]

and when this matrix is diagonal, the hypotheses are mutually orthogonal. A vector ${}_1m_a$ is implied by h hypotheses with coefficient matrix rank h; this is

\[ {}_1m_a = {}_1d_h^T\,({}_hC_a\;{}_aC_h^T)^{-1}\,{}_hC_a. \tag{6} \]

The elements of ${}_hd_1$ are the $d_i$ and the index T is used for a transpose. If $h = a$, the elements of ${}_1m_a$ must not be allowed to fall beyond the range $-1$ to 0, since these elements are the values implied for $-\phi_j$ by all ${}_0\kappa_i$ being true. If $h = a - 1$ contrasts, the elements of ${}_1m_a$ are the values implied for $\phi_. - \phi_j$ by all ${}_0\kappa_i$ being true and must not be allowed to fall beyond the range $-1$ to 1. These conditions are the further restrictions on the values $d_i$ may take, mentioned above, i.e. the $d_i$ must satisfy

\[ {}_hd_1 = {}_hC_a\;{}_am_1^T \tag{7} \]

with the elements of ${}_1m_a$ between $-1$ and 0 when $h = a$ and between $-1$ and 1 when $h = a - 1$ contrasts. The examples of ${}_0\kappa_i$ given above yield ${}_1m_a = (-0.5\;\; -0.5\;\; -0.5)$ when eqn. (6) is used. The hypotheses $\phi_1 + \phi_2 + \phi_3 - 2.5 = 0$, $\phi_3 - \phi_1 - 0.9 = 0$ and $\phi_1 - 2\phi_2 + \phi_3 + 1 = 0$ each have $d_i$ between [...], but eqn. (6) gives ${}_1m_a = (-0.217\;\; -1.167\;\; -1.117)$, with elements beyond the permitted range. A more extreme example is the same hypotheses with $d_i = -3$, 1 and 2. In this case the first hypothesis can be true only if $\phi_1 = \phi_2 = \phi_3 = 1$ and if this were true $\phi_3 - \phi_1 + 1 \ne 0$ and $\phi_1 - 2\phi_2 + \phi_3 + 2 \ne 0$. The most popular hypotheses are contrasts with $d_i = 0$. In such cases ${}_1m_a$ is the null vector and satisfies the restriction applied to eqn. (7).
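Equation (6) is easy to evaluate directly. The Python sketch below computes the implied vector for the three example hypotheses of this section and for the second set, whose vector is reported as $(-0.217\;\; -1.167\;\; -1.117)$. It is an illustration under an assumption: the constants $d_i = -2.5$, $-0.9$ and 1 used for the second set are inferred here so as to reproduce that reported vector, since they are not fully legible in the source.

```python
def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    aug = [[float(v) for v in a[i]] + [float(b[i])] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                f = aug[r][col] / aug[col][col]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]

def implied_vector(c, d):
    """m = d' (C C')^-1 C for coefficient rows c and constants d, as in eqn. (6)."""
    h, a = len(c), len(c[0])
    cct = [[sum(c[i][k] * c[j][k] for k in range(a)) for j in range(h)]
           for i in range(h)]
    z = solve(cct, d)
    return [sum(z[i] * c[i][j] for i in range(h)) for j in range(a)]

def within_range(m, low, high):
    """True when every element of m lies in the permitted interval."""
    return all(low <= v <= high for v in m)

m_first = implied_vector([[1, 1, 0], [1, -1, 0], [0, 0, 1]], [-1, 0, -0.5])
m_second = implied_vector([[1, 1, 1], [-1, 0, 1], [1, -2, 1]], [-2.5, -0.9, 1.0])
print(m_first, within_range(m_first, -1, 0))
print(m_second, within_range(m_second, -1, 0))
```

The first set stays inside the permitted range of $-1$ to 0 for $h = a$; the second does not, which is the sign that its constants cannot all hold for proportions between 0 and 1.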
When decisions have been made, as a result of statistical tests, about h hypotheses with coefficient matrix rank h, the vector ${}_1\delta_h$ may be defined to have elements $\delta_i$ for ${}_1\kappa_i$ accepted, $-\delta_i$ for ${}_{-1}\kappa_i$ accepted and zero for ${}_0\kappa_i$ accepted. The decisions made imply values for the $\phi_j$ which are the elements of ${}_1\phi_a$ given by

\[ {}_1\phi_a = {}_1\delta_h^T\,({}_hC_a\;{}_aC_h^T)^{-1}\,{}_hC_a - {}_1m_a. \tag{8} \]

If $h = a - 1$ contrasts are used, then the right-hand side of eqn. (8) yields values of $\phi_j - \phi_.$.


Under conditions of simple random sampling, the sample proportions in class $K$ (the $p_j$) are maximum likelihood, unbiased estimators of $\phi_j$. Under the same conditions, a constant n observations being drawn from each population, the maximum likelihood, unbiased estimators of ${}_0\kappa_i$ and $\Lambda_i$ are

\[ k_i = c_{i1}p_1 + c_{i2}p_2 + \ldots + c_{ia}p_a + d_i \tag{9} \]

and

\[ L_i = n\Big(\sum_j c_{ij}p_j + d_i\Big)^2 \Big/ \sum_j c_{ij}^2 = \Big(\sum_j c_{ij}f_j + nd_i\Big)^2 \Big/ \Big(n\sum_j c_{ij}^2\Big), \tag{10} \]

in which $f_j$ is the frequency in class $K$ in sample j. Define ${}_hk_1$ to have elements $k_i$, then for $h = a - 1$ contrasts with ${}_hC_a$ having rank h, define

\[ L_m = n\;{}_1k_h^T\,({}_hC_a\;{}_aC_h^T)^{-1}\,{}_hk_1. \tag{11} \]
If the contrasts are mutually orthogonal

\[ L_m = L_1 + L_2 + \ldots + L_h. \tag{12} \]

$L_m$ is not only a 'total' of $a - 1$ contrasts, it is also the maximum value of $L_i$ which any contrast can yield for given ${}_1m_a$. If all $d_i = 0$, the maximum value of $L_i$ is

\[ L_m = \mathrm{SSB} = n\sum_j (p_j - p_.)^2 = n\Big(\sum_j c_{mj}p_j\Big)^2 \Big/ \sum_j c_{mj}^2, \tag{13} \]

which is the corrected sum of squares between proportions, as in analysis of variance, and $c_{mj} = p_j - p_.$, with $p_.$ being the mean of the a values of $p_j$. If all $d_i$ are not zero and $m_j$ ($j = 1, 2, \ldots, a$) are the elements of ${}_1m_a$, let $y_j$ be the sample proportions in class $K$ and redefine $p_j = y_j + m_j$, then eqn. (13) still holds. In this case the $p_j$ are the departures of the sample proportions from the pattern implied for them by the h contrasts. If $h = a$ hypotheses, with ${}_hC_a$ having rank h, are used, then eqns. (11) and (12) are equivalent to

\[ L_m = \mathrm{USSB} = n\sum_j p_j^2 = n\Big(\sum_j p_jc_{mj}\Big)^2 \Big/ \sum_j c_{mj}^2, \tag{14} \]

in which $p_j = y_j + m_j$ and $c_{mj} = p_j$. The $p_j$ are now the departures of the sample proportions in class $K$ from the values of $\phi_j$ implied by the a null hypotheses and $L_m$ is equivalent to an adjusted but uncorrected sum of squares between samples.

Apart from the restrictions on the values $d_i$ and $\delta_i$ may take, and the variance definition ($\phi_.(1 - \phi_.)$ in the population), the system outlined above is identical to that which applies to linear hypotheses about the means of a normal variates (see Rodger, 1967). The method proposed for testing hypotheses about $\phi_j$, given in the next section, is also almost identical to that for testing hypotheses about the means of normal variates.


3. HYPOTHESIS TESTING

The test procedure recommended here is to reject any ${}_0\kappa_i$ for which

\[ F_i = L_i / [\nu_1 p_.(1 - p_.)] > F[E\alpha];\, \nu_1, \infty. \tag{15} \]

The quantity $F[E\alpha];\, \nu_1, \infty$ is an abscissal value of the central variance-ratio distribution with degrees of freedom $\nu_1$ in the numerator and $\nu_2 = \infty$ in the denominator, i.e. it is chi-squared divided by its degrees of freedom ($\nu_1$). Define $\pi_r$ as the area under this distribution between $rF[E\alpha];\, \nu_1, \infty$ and $(r+1)F[E\alpha];\, \nu_1, \infty$ ($r = 0, 1, 2, \ldots, \nu_1$ and when $r = \nu_1$, $(r+1) = \infty$). $E\alpha$ is then defined as

\[ E\alpha = \sum_{r=0}^{\nu_1} r\pi_r / \nu_1. \tag{16} \]


co for Ех-0:05, 0-01 and 
co differ consider 


Values of F[Eo]; v» »,—1 to 8 are given in 
Table 1 below. Values of F[Eo]; v» ably from the values of F 
TABLE 1. VALUES OF F[Eo]; v», © 
E 3 4 5 6 7 8 
"05 . 37 1:247 11155 1:082 

3:841 . 1:839 1:551 1 
001 6:635 EA 2-994 2:516 2219 2019 1:872 1-761 


Ex = 1 


y= 


e of Fa; ур v has an area o to its right and 
al to F[Eo]; v» v» only when y, =1; since 


Е[Ех]; vy ¥2 Was proposed as a test statistic 
6 


Usually tabulated. The usual valu 
к * to its left, and this is identic: 
e 1-а, r= a and Ea=%- 
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xima- | 

hich would yield decision-based error-rates by Rodger (19670). а approxi 1 
tion to Е(Еа); ә, va is Fa. v, which is tabulated in Rodger (19 Va - 

i The decision rule given by eqn. (15) employs F as an арр did 
exact sampling distribution of proportions, but this fact will be н охна 
section and results reported as if they were exact. The test's appro: 
nature will be discussed in the last section of this paper. -—-—— 
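
The definition of F[Eα]; ν₁, ∞ in eqn. (16) can be checked numerically. The Python sketch below is an illustration only and is not the procedure by which Table 1 was compiled: the function names `chi2_sf`, `expected_rejection_rate` and `critical_value` are invented here, and the root is found by plain bisection. It relies on the identity Σ rπ_r = Σ_{r=1}^{ν₁} P(F ≥ rF[Eα]) and on ν₁F being distributed as chi-squared with ν₁ degrees of freedom when ν₂ = ∞.

```python
import math


def chi2_sf(x, df):
    """Upper-tail area of chi-squared with integer df, built up by recurrence."""
    if x <= 0.0:
        return 1.0
    if df % 2:                       # odd df: start from the 1-df tail
        total = math.erfc(math.sqrt(x / 2.0))
        k = 3
    else:                            # even df: start from zero and add terms
        total = 0.0
        k = 2
    while k <= df:
        # term (x/2)^(k/2 - 1) * exp(-x/2) / Gamma(k/2), computed in logs
        total += math.exp((k / 2.0 - 1.0) * math.log(x / 2.0)
                          - x / 2.0 - math.lgamma(k / 2.0))
        k += 2
    return total


def expected_rejection_rate(f_crit, nu1):
    """E = sum_r r * pi_r / nu1 for a central F with (nu1, infinity) d.f."""
    tail_sum = sum(chi2_sf(nu1 * r * f_crit, nu1) for r in range(1, nu1 + 1))
    return tail_sum / nu1


def critical_value(e_alpha, nu1, lo=0.0, hi=100.0, iters=200):
    """Solve expected_rejection_rate(F, nu1) = e_alpha by bisection."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if expected_rejection_rate(mid, nu1) > e_alpha:
            lo = mid                 # rate still too high: raise the cut-off
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    for nu1 in (1, 5):
        print(nu1, round(critical_value(0.05, nu1), 3))
```

For ν₁ = 1 the result coincides with the ordinary 5 per cent point of chi-squared on one degree of freedom, as the text requires; for larger ν₁ the same routine may be compared with the entries of Table 1.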

If h = a − 1 contrasts, each with d_i = 0 and C of rank h, are estimated
then, for L_m given by eqn. (11),

    X_m² = L_m/p.(1 − p.)    (17)

is the popular chi-squared statistic for contingency tables, i.e.

    X_m² = Σ (o − e)²/e,    (18)

where o and e are the observed and expected frequencies in the table.
If the contrasts are mutually orthogonal and their values of L_i are substituted
in eqn. (15), then

    X_m² = ν₁ Σ_i F_i.    (19)

Thus a − 1 mutually orthogonal contrasts, with d_i = 0, partition X_m² into
independent parts. If the contrasts do not have d_i = 0, or if h = a hypotheses
with C of rank h are used, then the value of X_m² in eqns. (17) and (19) is no
longer the usual chi-squared statistic for contingency tables, but the sampling
distribution is still, approximately, a chi-squared distribution, central or
non-central depending upon whether or not all ₀k_i are true.


Two strategies exist for testing h hypotheses, with C of rank h, by eqn. (15).
In the planned tests strategy the hypotheses are chosen independently […]
and each tested by eqn. (15) with ν₁ = 1. The joint distribution of the h values
of F_i^½, in this case, is multivariate normal with covariance matrix which is the
matrix of correlations among hypotheses and mean vector with elements
which may be negative, zero or positive quantities. The probability of any
particular pattern of detection of true hypotheses is the hyper-volume in this
distribution which is bounded by h pairs of hyper-planes, the pairs being
mutually orthogonal to one another. Each pair of hyper-planes is parallel to
all axes but one, which is intersected by this pair (i) at −∞, z_{½α}; (ii) at z_{½α}, z_{1−½α};
or (iii) at z_{1−½α}, ∞, depending upon whether ₋₁k_i, ₀k_i or ₁k_i is presumed to be detected.
The values of z_{½α} and z_{1−½α} are abscissal values of a unit normal distribution which
have areas to their left ½α and 1 − ½α, respectively (z_{1−½α} = {F[Eα]; 1, ∞}^½).

The post-hoc tests strategy has all d_i = 0, but m may be specified
independently of the sample data, preferably before these are collected, and tests are
carried out on p_j = y_j + m_j, the y_j being the sample proportions in class K and m_j
the elements of m_a. Eqn. (15) uses ν₁ = a (or ν₁ = a − 1 if only contrasts are
considered). The maximum number of mutually orthogonal hypotheses it is
possible to reject by this procedure is

    q = [F_m/F[Eα]; ν₁, ∞] ≤ ν₁,    (20)

in which the square brackets indicate truncation of the fractional part of the
ratio and

    F_m = L_m/ν₁p.(1 − p.).    (21)
If all ₀k_i are true and q, given by eqn. (20), of the ν₁ formal decisions are
rejections, the expected proportion of rejections is Eα. If ν₁ mutually
orthogonal ₀k_i are false and their true alternatives have equal Δ_i, the distribution of
F_m is the variance-ratio distribution with ν₁, ∞ degrees of freedom and
non-central parameter Δ_m = ν₁Δ_i. Define the area between rF[Eα]; ν₁, ∞ and
(r + 1)F[Eα]; ν₁, ∞ (r = 0, 1, 2, ..., ν₁ and when r = ν₁, (r + 1) = ∞) in this
distribution to be π_r′. The expected proportion of rejections when q is used is
now Eβ, given by

    Eβ = Σ_{r=0}^{ν₁} rπ_r′/ν₁.    (22)

Values of Δ[Eβ]; ν₁, ∞ = Δ_m/ν₁ for ν₁ = 1 to 8, Eα = 0.05, 0.01 and Eβ = 0.5,
0.7, 0.8, 0.9, 0.95, 0.99 are given in Table 2.

TABLE 2. VALUES OF Δ[Eβ]; ν₁, ∞ FOR Eα = 0.05 (FIRST LINE OF EACH PAIR)
AND 0.01 (SECOND LINE)

    Eβ     Eα    ν₁ = 1       2       3       4       5       6       7       8
    0.50   0.05   3.841   2.962   2.847   2.955   3.151   3.383   3.627   3.874
           0.01   6.635   5.215   5.096   5.327   5.672    […]    6.491   6.924
    0.70   0.05   6.170   4.626   4.379   4.488   4.729   5.028   5.351   5.683
           0.01   9.605   7.475   7.292   7.592   8.052   8.595   9.178   9.782
    0.80   0.05   7.851   5.792   5.418   5.487   5.721   6.029   6.370   6.726
           0.01  11.676   9.009   8.712   8.990   9.465    […]   10.678  11.340
    0.90   0.05  10.511   7.602   6.981    […]    7.129   7.414    […]    8.107
           0.01  14.876  11.321  10.768  10.945  11.382  11.958  12.606    […]
    0.95   0.05  12.995   9.257    […]    8.206   8.321   8.564   8.872   9.214
           0.01  17.808  13.378  12.537    […]   12.945    […]   14.111  14.793
    0.99   0.05  18.370  12.736  11.206  10.719  10.643  10.763  10.985  11.263
           0.01  24.034  17.576  16.932    […]     […]     […]     […]     […]

The expected proportion of rejections will be at least Eβ when ν₁ mutually
orthogonal hypotheses are equally false if n and δ_i satisfy

    Δ[Eβ]; ν₁, ∞ ≤ nδ_i²/φ.(1 − φ.)Σ_j c_ij².    (23)

The alternatives to the three ₀k_i given in Section 2 all yield Δ_i =
nδ_i²/φ.(1 − φ.)Σ_j c_ij² = 0.1n. For Eα = 0.05, Eβ = 0.95 and ν₁ = 1 (for planned
tests), Table 2 gives 12.995 < 0.1n, so n = 130 satisfies these conditions. Should
φ.(1 − φ.) have its maximum possible value (0.25), then δ₁ = δ₂ = (0.05)^½ and
δ₃ = (0.025)^½, which is quite fine discrimination.

In an investigation with a = 5 in which only contrasts will be tested post-hoc,
if Δ_i is set at 0.2n, Eα at 0.05, Eβ at 0.95, Table 2 shows that 8.321 < 0.2n
for ν₁ = 4; so n = 42 may be used.

Generally, reasonable discrimination between null hypotheses and their
alternatives will be attained if 0.02n < Δ_i < 0.5n. This range corresponds to
0.2[φ.(1 − φ.)]^½ < δ_i < [φ.(1 − φ.)]^½ for a comparison and, were φ.(1 − φ.) at its
maximum possible value, this is 0.1 < δ_i < 0.5.

It is always possible to find q mutually orthogonal ₀k_i for rejection post-hoc,
and one set of hypotheses which leads to fairly simple results is that in which
there are q values of F_i = F_mν₁/q and ν₁ − q values of F_i = 0 when eqn. (15) is used.
When the δ_i, c_ij and m_j (the elements of m_a) for this set of mutually orthogonal
hypotheses are substituted in eqn. (8), the elements of φ_a are

    φ_j = p_j[φ.(1 − φ.)qΔ[Eβ]; ν₁, ∞/USSB]^½ − m_j    (24)

or, for contrasts only,

    φ_j − φ. = (p_j − p.)[φ.(1 − φ.)qΔ[Eβ]; ν₁, ∞/SSB]^½ − m_j.    (25)


of 
proposed by Rodger (196 potheses about the means 
normal variates, Argum 


UA aper- 
-hoc tests are given in that рар 


4. THREE METHODS 
Castellan (1965) 


in 
shows Psychological applications of a method of Irw 
(1949) and Lancaster (1949, 1950) fo 
contingency table of a 


> ny order into а 
table the interaction Xm? is 


Lob T8 d 

Partitioned into а-1 parts, each of eee 

component for а-1 mutually orthogonal contrasts, each with d, =0. nts 

the my (the column totals) are equal, these contrasts have (Са with 2 el ie 
equal to 1, one element equal to гапа; elements equal to zero in ro 

(121,2, ;4—1). An example for a=4is 
1 0 =1 0 
6) 
a, = |1 0 1 -2 P 


Бола! polynomials for i 
› Sums of squares’ (p. 4 6). 
The components extracted by Irwin, Lancaster and Kastenbaum from the
exact partition of the interaction X_m² can be shown to be identical to L_i/p.(1 − p.)
when a 2 × a table with equal n_j is analysed. The method proposed in eqn. (15)
differs from the Irwin-Lancaster-Kastenbaum scheme in using F rather than X²;
it places no restriction on the type of contrast or linear hypothesis which may
be estimated, it makes special provision for anticipating the proportion of ₀k_i
which may be rejected, whether ₀k_i is true or false, in the context of planned tests
(with ν₁ = 1) or post-hoc tests (with ν₁ = a or ν₁ = a − 1 for contrasts only), but it is
restricted to two-rowed tables.

Irwin, Lancaster, Kastenbaum and Castellan do not discuss the control of
error-rate in the tests, but Keith Smith (1966) gives psychological examples
of the application of the Kastenbaum scheme with explicit reference to an
‘experiment-wide’ error-rate of the first kind, from the Scheffé (1953) viewpoint,
for ‘posterior comparisons’ in contingency tables. It has been argued
elsewhere by Rodger (1965, 1967a, b) that the proper basis for error-rate choice is
that in which error may be made, i.e. decisions. This view led to the proposal
by Rodger (1967b) that F[Eα]; ν₁, ν₂ and Δ[Eβ]; ν₁, ν₂ are appropriate to tests of
linear hypotheses chosen post-hoc rather than the Fα; ν₁, ν₂ and (rarely)
Δβ; ν₁, ν₂ in current use. From this point of view Keith Smith's test statistic
should be ν₁F[Eα]; ν₁, ∞ and the various procedures discussed by Marascuilo
(1966) for testing contrasts chosen post-hoc should also use this statistic rather
than the χ²_{K−1}(1 − α) (= ν₁Fα; ν₁, ∞ with ν₁ = K − 1) proposed.
Goodman (1964) gives a method for judging contrasts among multinomial
populations, using

    Y_m² = […]    (27)

in which o_kj is the observed frequency in class k of population j and p_k* is
calculated from a function of the weighted harmonic means of the proportions
in the various classes of the sample data. In a 2 × a table with equal n_j, this
statistic reduces to

    Y_m² = n(Σ_j c_mj p_j)²/Σ_j c_mj² p_j(1 − p_j),    (28)

where c_mj = (p_j − p.). Goodman uses 1 − α as the probability that all contrasts
will be accepted when they are true, but his procedures may be modified for
linear hypotheses in 2 × a tables and decision-based error-rates to the rule that
₀k_i is rejected when

    F_i = n(Σ_j c_ij p_j)²/ν₁ Σ_j c_ij² p_j(1 − p_j) > F[Eα]; ν₁, ∞    (29)

with ν₁ = 1 for planned tests and ν₁ = a (or ν₁ = a − 1 if only contrasts are
considered) otherwise. The non-central parameter for each of a set of mutually
orthogonal hypotheses which gives an expected proportion of rejections Eβ is

    Δ[Eβ]; ν₁, ∞ = n(Σ_j c_ij φ_j)²/Σ_j c_ij² φ_j(1 − φ_j)    (30)

when the decision rule in eqn. (29) is used and q = [Y_m²/ν₁F[Eα]; ν₁, ∞] ≤ ν₁
are rejected in post-hoc tests.

When post-hoc tests are carried out by eqn. (29), there is greater average
power than when eqn. (15) is used and the value of ν₁Δ[Eβ]; ν₁, ∞ is greater
in eqn. (30) than in eqn. (23) when ₀k_m is false and c_mj is used. It is not true
that eqn. (29) gives greater theoretical power than eqn. (15) for all planned tests;
thus for φ₁ − φ₂ = 0, when φ₁ = 0.4, φ₂ = 0.6 and φ. = 0.2, eqn. (30) gives Δ = n/12
but eqn. (23) gives Δ = n/8. The expression of alternatives to ₀k_i and the
logical relations between decisions, when some ₀k_i are rejected, are clearly more
complex in the case of eqn. (29) than in eqn. (15). Whether the increased
power of post-hoc tests by eqn. (29) justifies the increase in complexity will be
discussed in Section 6.
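
The contrast between the two non-central parameters for the comparison just described (φ₁ = 0.4, φ₂ = 0.6, φ. = 0.2) is easily verified. The sketch below is illustrative; the function names `delta_r` and `delta_g` are invented here and simply evaluate the right-hand sides of eqns. (23) and (30) as they are read in this section, with n = 1 so that the results are the multipliers of n.

```python
def delta_r(n, phi, c, phi_bar):
    """Non-central parameter with the pooled variance phi.(1 - phi.), eqn. (23)."""
    effect = sum(ci * p for ci, p in zip(c, phi))
    return n * effect ** 2 / (phi_bar * (1.0 - phi_bar) * sum(ci ** 2 for ci in c))


def delta_g(n, phi, c):
    """Non-central parameter with separate variances phi_j(1 - phi_j), eqn. (30)."""
    effect = sum(ci * p for ci, p in zip(c, phi))
    return n * effect ** 2 / sum(ci ** 2 * p * (1.0 - p) for ci, p in zip(c, phi))


if __name__ == "__main__":
    c = [1, -1]                 # the comparison phi_1 - phi_2
    phi = [0.4, 0.6]
    print(delta_r(1, phi, c, phi_bar=0.2))   # multiplier of n under eqn. (23)
    print(delta_g(1, phi, c))                # multiplier of n under eqn. (30)
```

Here the pooled variance φ.(1 − φ.) = 0.16 is smaller than the separate variances φ_j(1 − φ_j) = 0.24, which is why the first rule has the larger non-central parameter in this instance.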

The transformation sin⁻¹ φ_j^½, which seems to have been used first by Fisher
(1922), has been recommended by Cohen (1967). Since this is not a linear
function of φ_j, an hypothesis which is true for φ_j may be false when the
transformation is used: thus 7φ₁ − 2φ₂ − 4φ₃ − φ₄ = 0 is true when φ₁ = 0.8, φ₂ = 0.6,
φ₃ = 0.9 and φ₄ = 0.8, yet 7 sin⁻¹ φ₁^½ − 2 sin⁻¹ φ₂^½ − 4 sin⁻¹ φ₃^½ − sin⁻¹ φ₄^½ = 0 is false,
its value being 7(1.1071) − 2(0.8862) − 4(1.2491) − 1.1071 = −0.1262 radians.
This departure from zero is over-impressive because no allowance has been made
for the rather large Σ c² = 70. It is equivalent to a departure of a single φ from
its true value by ±0.015.
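
The artifact in the four-population example above can be reproduced in a few lines. The sketch is illustrative only; the function name `arcsine_artifact` is invented here, and full floating-point precision is used, so the transformed value differs in the third decimal place from the figure obtained with the four-decimal angles quoted in the text.

```python
import math


def arcsine_artifact(c, phi):
    """Return the contrast on the phi scale, on the arcsine scale, and the
    arcsine value divided by the square root of the sum of squared weights."""
    linear = sum(ci * p for ci, p in zip(c, phi))
    transformed = sum(ci * math.asin(math.sqrt(p)) for ci, p in zip(c, phi))
    scaled = transformed / math.sqrt(sum(ci ** 2 for ci in c))
    return linear, transformed, scaled


if __name__ == "__main__":
    c = [7, -2, -4, -1]
    phi = [0.8, 0.6, 0.9, 0.8]
    linear, transformed, scaled = arcsine_artifact(c, phi)
    print(round(linear, 6), round(transformed, 4), round(scaled, 4))
```

Dividing by (Σ c²)^½ = 70^½ puts the departure on the scale of a single transformed proportion, which is the allowance referred to in the text.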


A small transformation artifact occurs with many contrasts for which d_i = 0,
but not for all; thus there is no such artifact for comparisons. If

    ζ_j = 2 sin⁻¹ φ_j^½    (31)

and

    z_j = 2 sin⁻¹ p_j^½,    (32)

then z_j, when expressed in radians, has variance n⁻¹. If p_j = 0 it is replaced
by (4n)⁻¹ and p_j = 1 is replaced by 1 − (4n)⁻¹ in eqn. (32). The hypothesis

    ₀k_i = c_i1ζ₁ + c_i2ζ₂ + ... + c_iaζ_a = 0    (33)

is rejected when

    F_i = n(Σ_j c_ij z_j)²/ν₁ Σ_j c_ij² > F[Eα]; ν₁, ∞    (34)

with ν₁ = 1 for planned contrasts and ν₁ = a − 1 for post-hoc contrasts. Ignoring
this artifact, the non-central parameter for each of a set of mutually
orthogonal contrasts which gives an expected proportion of rejections Eβ
is

    Δ[Eβ]; ν₁, ∞ = n(Σ_j c_ij ζ_j)²/Σ_j c_ij²    (35)

when eqn. (34) is used and q = [F_m/F[Eα]; ν₁, ∞] ≤ ν₁ contrasts are rejected in
post-hoc tests. Here F_m = n Σ(z_j − z.)²/ν₁, in which z. is the mean of the z_j. How
well eqns. (34) and (35) approximate Eα and Eβ will be compared with the
results for eqns. (15), (23) and for eqns. (29), (30) in Section 6.

The test techniques given by eqns. (15), (29) and (34) will be labelled the
Rν₁, Gν₁ and Zν₁ methods, respectively.


5. EXAMPLES

In a study of treatment of first-referral psychoneurotic patients, all patients
will have drugs prescribed and their homes will be classed by the social worker
as ‘supportive’ (S) or ‘non-supportive’ (S̄). Patients in each of these groups
will be assigned at random to one of the four treatments, day-hospital with
psychotherapy (HT), day-hospital without psychotherapy (HT̄), out-patients
with psychotherapy (H̄T) and out-patients without psychotherapy (H̄T̄).
There will therefore be a = 8 populations labelled 1 = SHT, 2 = SHT̄, 3 = SH̄T,
4 = SH̄T̄, 5 = S̄HT, 6 = S̄HT̄, 7 = S̄H̄T and 8 = S̄H̄T̄. The psychiatric health
of all patients will be assessed before treatment starts and again three months
later. Patients showing ‘significant improvement’ in the opinion of the
assessing psychiatrist will be classed K and the remainder K̄.

The following null hypotheses are proposed

    ₀k₁ = φ₁ + φ₂ + φ₃ + φ₄ − φ₅ − φ₆ − φ₇ − φ₈ = 0;
    ₀k₂ = φ₁ + φ₂ − φ₃ − φ₄ + φ₅ + φ₆ − φ₇ − φ₈ = 0;
    ₀k₃ = φ₁ − φ₂ + φ₃ − φ₄ + φ₅ − φ₆ + φ₇ − φ₈ = 0;
    ₀k₄ = φ₁ + φ₂ − φ₃ − φ₄ − φ₅ − φ₆ + φ₇ + φ₈ = 0;
    ₀k₅ = φ₁ − φ₂ + φ₃ − φ₄ − φ₅ + φ₆ − φ₇ + φ₈ = 0;
    ₀k₆ = φ₁ − φ₂ − φ₃ + φ₄ + φ₅ − φ₆ − φ₇ + φ₈ = 0;
    ₀k₇ = φ₁ − φ₂ − φ₃ + φ₄ − φ₅ + φ₆ + φ₇ − φ₈ = 0;

with alternatives […]. These ₀k_i happen to be identical to the hypotheses
which would be tested were the data subjected to a 2³ factorial analysis, i.e. the
₀k_i represent S, H, T, S × H, S × T, H × T and S × H × T respectively.
Each Δ_i = […]. The study is carried out with n = 53 in
each sample and the proportion in each group showing ‘significant
improvement’ is found to be p_j = 0.94, 0.62, 0.85, 0.81, 0.85, 0.57, 0.66, 0.70
(f_j = 50, 33, 45, 43, 45, 30, 35, 37) giving L_i = 1.283, 0.011, 2.385, 0.170, 0.095,
2.385, 0.011 and F_i = 6.843, 0.059, 12.720, 0.907, 0.507, 12.720, 0.059 (i = 1,
2, ..., 7). Only F₁, F₃ and F₆ are greater than F[0.01]; 1, ∞ = 6.635; so ₀k₁,
₀k₃ and ₀k₆ are rejected and, since Σ c_ij p_j > 0 for these, ₁k₁, ₁k₃ and ₁k₆ are
accepted. The value of X_m² from the 2 × 8 table is 34.364 and this differs from
Σ F_i = 33.815 only by rounding error.
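
The partition of X_m² by the seven factorial contrasts in this example can be checked from the frequencies f_j = 50, 33, 45, 43, 45, 30, 35, 37 with n = 53. The sketch below is illustrative: the function names are invented here, and the sign patterns assume the eight populations are ordered as S (HT, HT̄, H̄T, H̄T̄) followed by S̄ in the same order. Because it works from the frequencies rather than from proportions rounded to two places, the individual F_i differ slightly from those quoted, while their sum agrees with the chi-squared statistic exactly.

```python
def pearson_chi2(freqs, n):
    """Pearson chi-squared for a 2 x a table with n observations per column."""
    a = len(freqs)
    p_bar = sum(freqs) / (a * n)
    total = 0.0
    for f in freqs:
        e_k = n * p_bar                 # expected count in class K
        e_not = n * (1.0 - p_bar)       # expected count in the other class
        total += (f - e_k) ** 2 / e_k + ((n - f) - e_not) ** 2 / e_not
    return total


def contrast_f(freqs, n, c, nu1=1):
    """F_i = L_i / [nu1 * p.(1 - p.)], with L_i = n (sum c p)^2 / sum c^2."""
    a = len(freqs)
    p = [f / n for f in freqs]
    p_bar = sum(p) / a
    l_i = n * sum(ci * pj for ci, pj in zip(c, p)) ** 2 / sum(ci ** 2 for ci in c)
    return l_i / (nu1 * p_bar * (1.0 - p_bar))


def factorial_contrasts():
    """Sign patterns for S, H, T, SH, ST, HT, SHT (assumed population order)."""
    cells = [(s, h, t) for s in (1, -1) for h in (1, -1) for t in (1, -1)]
    effects = [
        lambda s, h, t: s, lambda s, h, t: h, lambda s, h, t: t,
        lambda s, h, t: s * h, lambda s, h, t: s * t, lambda s, h, t: h * t,
        lambda s, h, t: s * h * t,
    ]
    return [[e(*cell) for cell in cells] for e in effects]


if __name__ == "__main__":
    freqs = [50, 33, 45, 43, 45, 30, 35, 37]
    n = 53
    fs = [contrast_f(freqs, n, c) for c in factorial_contrasts()]
    print([round(f, 3) for f in fs])
    print(round(sum(fs), 3), round(pearson_chi2(freqs, n), 3))
```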
From these decisions,

    δ′ = (2[φ.(1 − φ.)]^½  0  2[φ.(1 − φ.)]^½  0  0  2[φ.(1 − φ.)]^½  0)

and eqn. (8) gives

    φ_j − φ. = ¾[φ.(1 − φ.)]^½, −¼[φ.(1 − φ.)]^½, ¼[φ.(1 − φ.)]^½, ¼[φ.(1 − φ.)]^½,
               ¼[φ.(1 − φ.)]^½, −¾[φ.(1 − φ.)]^½, −¼[φ.(1 − φ.)]^½, −¼[φ.(1 − φ.)]^½.

Whereas the decisions made indicate that a supportive home is associated with
improvement, as is psychotherapy, and that psychotherapy is more effective with
day-hospital patients than out-patients, much more information can be gleaned
from the φ_j − φ. implied by these decisions. Thus in decreasing order of
effectiveness the treatments are SHT, SH̄T = SH̄T̄ = S̄HT, SHT̄ = S̄H̄T = S̄H̄T̄
and S̄HT̄.


The illustration of the R1 method above may be compared with the G1
method in which F_i = 7.435, 0.064, 13.821, 0.985, 0.550, 13.821, 0.064. Each
of these is 1.5/1.38 times the corresponding F_i in R1. This is because Σ c_ij²
is constant for the seven hypotheses considered here, as is |c_ij| for all i and j,
and Σ c_ij² p_j(1 − p_j) = 1.38 while p.(1 − p.)Σ c_ij² = 1.5. This advantage does not
exist for any hypothesis, e.g. for φ₆ − φ₇ = 0, F = 53(−0.09)²/2 × 0.1875 = 1.14
by R1 and F = 53(−0.09)²/(0.57 × 0.43 + 0.66 × 0.34) = 0.91 by G1.

For the Z1 method, z_j = 2 sin⁻¹ p_j^½ = 2.65, 1.81, 2.35, 2.24, 2.35, 1.71, 1.90,
1.98 radians, giving F_i = 8.163, 0.016, 15.106, 0.637, 1.008, 13.929, 0.001.
The three methods lead to identical decisions in this illustration, but this will
not always occur.

As an illustration of post-hoc tests consider the […] sample of n subjects
[…] whose distribution may differ. The samples will therefore be divided at
their common, observed median; the proportion […]. The median test makes
p. = 0.5 and, although […] in the hypotheses […] be considered in the […] n
and Table 2 shows that […] cost of a slight drop in Eβ, n = 20 will be drawn.
The investigation is carried out, the frequencies above the common,
observed median are found to be f_j = 6, 14, 7, 4, 13, 16, giving p_j = 0.30, 0.70,
0.35, 0.20, 0.65, 0.80, F_m = 4.88 and q = [4.88/1.371] = [3.6] = 3. If ‘simple’,
mutually orthogonal contrasts are sought for rejection, fewer than q can usually
be found and more than one set of such contrasts, each with an equal number of
rejections, may be possible. Thus two decision sets, each with two rejections,
these data justify are


$y—¢o= —[0°86.(1- 4.) and ф-Ф%--1089(1-90/І 


4,- 4s -[0:89.(1 - 4)! $1—$2+$a—$5= - [65.1 - 2) 
ф-Ф-0 b1—$4=0 
ф.-%-0 


фу-Фі-%%%-9 
24,:29,-%-%-%-%-0 d, 4s - 20, bat b5— 20070. 
Other sets are also possible. The $;— 4. implied by the above decision sets are 
ф-Ф-0 0, — (0:29(1-ФЛЬ- (0:29.(1-ФЛЬ [0-24.01 -ФЛЬ [26.1 — 2)? 
for that on the left and 
4 -4.- - 1014. -$ [140-7435 — 0261-4), [916.1 -4)r 
(0:19(1-9.)Ь (029(1-9./% 
for that on the right. Clearly the latter decision set makes finer and more
discriminations than the former. The set selected by the investigator should be
[…] its relevance to the object of the research and the plausibility of its
interpretation in the light of this and other evidence. The direct calculation
of φ_j − φ. by eqn. (25) avoids the need to choose among alternative sets of
decisions, but a choice must then be made of the minimum size of an effect that
will be interpreted.

In these data eqn. (25) gives

    φ_j − φ. = (p_j − p.)[φ.(1 − φ.)qΔ[Eβ]; ν₁, ∞/SSB]^½
             = (p_j − 0.5)[φ.(1 − φ.) × 3 × 8.321/6.1]^½
             = 2.023(p_j − 0.5)[φ.(1 − φ.)]^½.

If results are rounded to two decimal places, this yields

    φ_j − φ. = −0.40[φ.(1 − φ.)]^½, 0.40[φ.(1 − φ.)]^½, −0.30[φ.(1 − φ.)]^½,
               −0.61[φ.(1 − φ.)]^½, 0.30[φ.(1 − φ.)]^½, 0.61[φ.(1 − φ.)]^½.

The value of φ. is unknown, but were it 0.5, the values of φ_j would be almost
exactly those of p_j. This is not generally true but, since

    q = SSB/ν₁p.(1 − p.)F[Eα]; ν₁, ∞

(apart from the fraction truncated), then should φ. = p., eqn. (25) yields

    φ_j − φ. = (p_j − p.){Δ[Eβ]; ν₁, ∞/ν₁F[Eα]; ν₁, ∞}^½,

and this is generally true. Ignoring the fraction truncated in q and supposing
φ. = p., then the variability of the estimated φ_j is greater than that of p_j when
Δ[Eβ]; ν₁, ∞ > ν₁F[Eα]; ν₁, ∞, and the variability is less when Δ[Eβ]; ν₁, ∞ <
ν₁F[Eα]; ν₁, ∞. Inspection of Tables 1 and 2 shows […] φ. = p., to claim as much […].

For the G5 method,

    F_m = n(Σ_j c_mj p_j)²/ν₁ Σ_j c_mj² p_j(1 − p_j) = 20(0.305)²/5(0.0558) = 6.66

and q = [6.66/1.371] = [4.86] = 4. […] ‘simple’ contrasts from this […]

    φ₁ − φ₂ = −[0.4{φ₁(1 − φ₁) + φ₂(1 − φ₂)}]^½;
    φ₃ − φ₆ = −[0.4{φ₃(1 − φ₃) + φ₆(1 − φ₆)}]^½;
    φ₄ − φ₅ = −[0.4{φ₄(1 − φ₄) + φ₅(1 − φ₅)}]^½;
    φ₁ + φ₂ − φ₄ − φ₅ = 0;
    φ₁ + φ₂ − 2φ₃ + φ₄ + φ₅ − 2φ₆ = 0,

for which F_i = 1.52, 2.08, 2.08, 0.11, 0.34, e.g.

    F = n(Σ_j c_ij p_j)²/ν₁ Σ_j c_ij² p_j(1 − p_j) = 20(−0.45)²/5(0.2275 + 0.1600) = 2.08.

The greater value of q here than in R5 is typical of the greater average power
of Gν₁. Alternative decision sets can occur with Gν₁ as with Rν₁, but this will
not be illustrated here.

For the Z5 method z_j = 2 sin⁻¹ p_j^½ = 1.1592, 1.9824, 1.2660, 0.9274, 1.8754,
2.2142 radians, giving F_m = n Σ(z_j − z.)²/ν₁ = 20 × 1.353/5 = 5.410 and
q = [5.410/1.371] = 3. Contrasts across ζ_j corresponding to the two alternative
sets in φ_j given for R5 show the same patterns of rejections as shown there.
Thus for ζ₃ − ζ₆ = 0,

    F = n(Σ_j c_ij z_j)²/ν₁ Σ_j c_ij² = 20(−0.9482)²/5 × 2 = 1.798 > 1.371;

so ζ₃ − ζ₆ = −(0.8)^½ is asserted, since Δ_i = 0.4n and Δ_i = nδ_i²/Σ_j c_ij², so that

    δ_i = [Δ_i Σ_j c_ij²/n]^½ = [0.4n × 2/n]^½ = (0.8)^½.
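
The post-hoc quantities quoted for the six samples of this example (f_j = 6, 14, 7, 4, 13, 16 with n = 20) can be reproduced for all three methods. The sketch below is illustrative; the names `post_hoc_summary` and `effect_scale` are invented here, the critical value 1.371 is taken from Table 1 for ν₁ = 5, and the adjustment of p_j = 0 or 1 in eqn. (32) is omitted because it is not needed for these data.

```python
import math


def post_hoc_summary(freqs, n, f_crit):
    """Return {method: (F_m, q)} for the R, G and Z rules, contrasts only."""
    a = len(freqs)
    nu1 = a - 1
    p = [f / n for f in freqs]
    p_bar = sum(p) / a
    dev = [pj - p_bar for pj in p]

    # R method: pooled variance p.(1 - p.)
    ssb = n * sum(d ** 2 for d in dev)
    fm_r = ssb / (nu1 * p_bar * (1.0 - p_bar))

    # G method: separate variances p_j(1 - p_j), with c_mj = p_j - p.
    num = n * sum(d * pj for d, pj in zip(dev, p)) ** 2
    den = nu1 * sum(d ** 2 * pj * (1.0 - pj) for d, pj in zip(dev, p))
    fm_g = num / den

    # Z method: z_j = 2 arcsin(sqrt(p_j)), each with variance 1/n
    z = [2.0 * math.asin(math.sqrt(pj)) for pj in p]
    z_bar = sum(z) / a
    fm_z = n * sum((zj - z_bar) ** 2 for zj in z) / nu1

    return {name: (fm, min(nu1, int(fm / f_crit)))
            for name, fm in (("R", fm_r), ("G", fm_g), ("Z", fm_z))}


def effect_scale(q, delta_e_beta, ssb):
    """Multiplier of (p_j - p.)[phi.(1 - phi.)]^(1/2) in eqn. (25), with m_j = 0."""
    return math.sqrt(q * delta_e_beta / ssb)


if __name__ == "__main__":
    summary = post_hoc_summary([6, 14, 7, 4, 13, 16], n=20, f_crit=1.371)
    for name, (fm, q) in summary.items():
        print(name, round(fm, 3), q)
    print(round(effect_scale(3, 8.321, 6.1), 3))
```

The truncation in q is performed by `int`, which discards the fractional part of the ratio as the square brackets do in eqn. (20).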
to calculate 8-4 by 
$. is 


=0, where фу 
Population / to the right of М e—the value of 
Median ; t les, 


Fig. 1 illustrates the above 


P. 
argument. The four variates on the left have μ_e = 8 and the two on the right
μ_e = 5. The φ_j are the proportions of observations to the right of the common,
observed median of all samples, M_e = 6.1. In spite of equality of the μ_e, the φ_j
differ in the four populations on the left, except in the unusual cases where
φ_j = 0.82.

FIGURE 1. Population variates with four μ_e = 8 (on the left) and two μ_e = 5 (on the right):
sample M_e = 6.1. The φ_j are shown in the distributions.

6. SOME SAMPLING EXPERIMENTS

A number of random sampling experiments were carried out on a CDC 3600
computer to demonstrate and examine the adequacy of the approximations Rν₁,
Gν₁ and Zν₁. Twenty-four sets of 200 experiments were run, each experiment
with a = 10. These twenty-four sets were the combinations of n = 5, 10, 20
and 30 with six sets of φ_j, which were (i) all φ_j = 0.1, (ii) all 0.2, (iii) all 0.3,
(iv) all 0.5, (v) φ_j = 0.37, 0.23, 0.27, 0.13, 0.30, 0.16, 0.20, 0.06, 0.21, 0.07, with
φ. = 0.20, and (vi) φ_j = 0.98, 0.58, 0.71, 0.30, 0.78, 0.38, 0.51, 0.10, 0.53, 0.13,
with φ. = 0.50.

In each experiment n pseudo-random numbers (range 0 to 1) were generated
for sample j (j = 1, 2, ..., 10) by a version of the congruence method given by
Rotenberg (1960); the proportion of these greater than φ_j was p_j. Only contrasts,
with d_i = 0, were tested with Eα = 0.05. R1, G1, Z1, R9, G9, Z9, and the
classical X² test were calculated from the data of each experiment, the latter
with the ‘experiment-wide’ error-rate α = 0.05 using F0.05; 9, ∞ = 1.88. The
error-rate stated for a classical X² test refers to the decision made about the
truth of the hypothesis φ₁ = φ₂ = ... = φ_a with only one decision made in each
‘experiment’. The probability of rejecting this hypothesis when true is α and
of retaining it when false 1 − β, which is the area to the left of Fα; ν₁, ∞ in the
variance-ratio distribution with ν₁, ∞ degrees of freedom and non-central
parameter

    Δ_m = n Σ_j (φ_j − φ.)²/φ.(1 − φ.).

[…] this is generally greater […]. Although the X² method
is not recommended here, it is used for comparison with the Rν₁, Gν₁ and Zν₁
methods. The number of mutually orthogonal contrasts which may be rejected,
post-hoc, by this method is q = [X_m²/ν₁Fα; ν₁, ∞] ≤ ν₁ and for this purpose the
expected proportion of rejections is Σ_{r=0}^{ν₁} rπ_r/ν₁, where the π_r are areas under the
variance-ratio distribution with ν₁, ∞ degrees of freedom and non-central
parameter Δ_m = n Σ_j (φ_j − φ.)²/φ.(1 − φ.), which is zero when φ₁ = φ₂ = ... = φ_a.
These are the data reported below, not α and 1 − β. Since X² is not
recommended, the question of the closeness of the Fα; ν₁, ∞ approximation is not
examined.
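
A reduced version of one such experiment can be sketched as follows. This is an illustration under stated assumptions and not a reconstruction of the original program: Python's built-in generator stands in for the congruence method, p_j is counted here as the proportion of uniform numbers falling below φ_j so that its expectation is φ_j, only the planned R1 test of a single contrast is simulated, and the function name `planned_r1_rejection_rate` is invented.

```python
import random


def planned_r1_rejection_rate(phi, n, c, trials, f_crit=3.841, seed=1):
    """Proportion of simulated experiments in which the R1 rule rejects."""
    rng = random.Random(seed)
    a = len(phi)
    sum_c2 = sum(ci ** 2 for ci in c)
    rejections = 0
    for _ in range(trials):
        # p_j: proportion of n uniform numbers falling below phi_j (assumption)
        p = [sum(rng.random() < pj for _ in range(n)) / n for pj in phi]
        p_bar = sum(p) / a
        pooled = p_bar * (1.0 - p_bar)
        if pooled == 0.0:
            continue                      # no variation at all: nothing to test
        l_i = n * sum(ci * pj for ci, pj in zip(c, p)) ** 2 / sum_c2
        if l_i / pooled > f_crit:         # eqn. (15) with nu1 = 1
            rejections += 1
    return rejections / trials


if __name__ == "__main__":
    phi = [0.3] * 10                      # case (iii): all phi_j = 0.3
    contrast = [1, -1] + [0] * 8          # the comparison phi_1 - phi_2 = 0
    print(planned_r1_rejection_rate(phi, n=20, c=contrast, trials=2000))
```

With all φ_j equal the contrast is true, so the printed proportion estimates the rate of false rejections for that one comparison and may be set beside the nominal Eα = 0.05.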
Nine mutually orthogonal contrasts were specified for planned tests, with
ν₁ = 1, which were

    ₀k₁ = φ₁ − φ₂ = 0;   ₀k₂ = φ₃ − φ₄ = 0;   ₀k₃ = φ₁ + φ₂ − φ₃ − φ₄ = 0;
    ₀k₄ = φ₅ − φ₆ = 0;   ₀k₅ = φ₇ − φ₈ = 0;   ₀k₆ = φ₅ + φ₆ − φ₇ − φ₈ = 0;
    ₀k₇ = φ₉ − φ₁₀ = 0;  ₀k₈ = […];  ₀k₉ = […]

and used in all experiments.


In cases (v) and (vi) above, the planned Comparisons haye values 0-14 and 
0-40 (approx.) respectively, which are near the bottom and the top of the range 


recommended for < nation ’ in Section 3 above. In these 
Cases уку is true for all planned of п used are small; 


of rejections of contrast 7’ in the 
/' in the second row; 

79, The expected frequencies used in cases (i) to (iv) were 10--200Ех in the 

„ Ёа) in the second TOW. In cases (v) and (vi) саи 
Served number or correct acceptances of үкү? an 

Observed number of failures to accept ік)” with expected frequencies а 
i is procedure does some injustice to R 

е expectations for i are Correct, the tests of contrasts in 


E. А nt 
not Statistically independent Owing to the common eleme 
ators, 


; he 
for post-hoc methods was the chi-squared calculated by using t 
times p cont 


is the 
masts were rejected as observed frequencies (r is 
Observed value of 4) and 2005, as the expected frequencies. The т, beri 
‹ distribution With 9, co degrees of mesa "Each 
98 (0 to (іу) and non-central for cases (v) and (vi). 


3 
calculation of ‘fit’ for post-hoc methods had ten possible cells, but cells in the
tails of the π_r distribution were combined to give expected frequencies at
least 2.0. In cases (i) to (iv) the value ν = 2 and ν varied from 1 to 7, usually
being from 3 to 5, in the other two cases.

Were the sampling distribution used for statistical tests exact rather than
approximate, ‘fit’ would be distributed in central variance-ratio form with ν
and ∞ degrees of freedom. Although tests of significance are inappropriate to
examining the goodness-of-fit of approximations, any value of ‘fit’ less than
2.0 indicates good agreement between the approximation and observation, after
allowance for sampling error.

Total numbers of false rejections of contrasts for cases (i) to (iv) are given
in Table 3. For each cell, any discrepancy of 16 or more between observation
and expectation would be ‘significant’ by F0.05; 1, ∞. ‘Fit’ was averaged
over φ for each n and over n for each φ, and the results are also shown in Table 3.


TABLE 3. OBSERVED NUMBERS OF REJECTIONS OF ₀k_i: ALL CONTRASTS TRUE
(Expected frequencies shown at the top of each column)

      n        φ         R1      G1      Z1      R9      G9      Z9      X²
               E         90      90      90      90      90      90      10

      5       0.1        94      76      19      82      20       8       5
              0.2        80      88      71      86     132      45      10
              0.3       103     138     126     104     229      96       7
              0.5        62     170     174      99     341     127       8
           Ave. fit    1.36   11.43    6.12    1.66  177.74   22.03

     10       0.1        95     107      29     103      50      48      17
              0.2        99     137      94      95     120     100      11
              0.3        69     111      94      84     150     118      12
              0.5        76      95     102      82     151     116       5
           Ave. fit    0.88    2.41    2.92    0.82   30.04    7.48

     20       0.1       106      99      90      98      75     100      13
              0.2        94      94     115      97     116     118       6
              0.3       104     111     123      99     131     116      12
              0.5        73      94      97      97     139     107      14
           Ave. fit    1.13    0.96    1.32    0.86   12.20    4.48

     30       0.1        82      87      97      77      74      99       8
              0.2        94     101     107      89      98     102     […]
              0.3        86      93     101      90     103      93       4
              0.5       102     104     106     104     126     116      13
           Ave. fit    0.78    0.80    0.88    1.13    4.27    2.59

  Ave. fit    0.1      0.74    7.23    3.32    1.03   15.43   19.38
              0.2      0.92    1.94    1.28    0.64    7.51    6.50
              0.3      1.28    2.07    1.99    1.83   61.57    4.16
              0.5      1.22    4.06    4.07    0.99  139.74    6.54
All but one (R1 with n = 20) of the values of ‘ave fit’ […]
by a large amount. Although ‘ave fit’ decreases for Gν₁ and Zν₁ as n increases,
all values of ‘ave fit’ for Rν₁ are less than the criterion of good fit (2). The good
fit of Rν₁ in Table 3 is an important reason for […]


It is usually claimed that the X? approximatio 


favour Бу), usually 


claim 10) to be satisfactory, Ry, 
smaller figures than this, In ten of the sixteen sets 
in Table 3, the expected frequencies in class K we 


TABLE 4. OBSERVED NUMBERS (O) AND EXPECTED NUMBERS (E) OF DETECTIONS
OF ₁k_i (PLANNED TESTS) AND REJECTIONS OF ₀k_i (POST-HOC TESTS)

Set (v) φ_j = 0.37, etc.

      n             R1      G1      Z1      R9      G9      Z9      X²
      5     O      128     180     111     155     222      87      28
            E      […]   160.6   161.8   152.6   157.9   160.0    32.4
            Fit   2.96   11.33    4.46    1.25    8.65   21.59

     10     O      213     291     204     210     238     212      60
            E    229.9   233.4   235.8   214.0   224.3   228.3    63.3
            Fit   2.91    2.69    4.39    0.12    0.64    1.20

     20     O      320     369     404     316     360     377     117
            E    354.8   381.9   386.6   334.0   354.2   362.0   133.8
            Fit   2.62    1.08    1.79    2.60    0.88    0.97

     30     O      469     556     573     441     476     520     196
            E    489.9   529.0   535.8   453.1   483.4   495.2   201.6
            Fit   2.41    1.00    1.34    0.79    0.49    1.20
Set (vi) $;— 0-98, etc. 
т 
Te is Е, 597 335 1460 438 149 
з 25 284 7*9 — 4047 75% 5307 1746 
5-39 1311 50847 10-60 
0 
75 mus tree 976 667 1701 950 313 
fn 200 2, 9984 7138 15640 965:3 3428 
445 11-78 18-83 438 
0 
Deg үз i. 1536 1285 1798 1696 650 
it 42; 1502.5 1315-2 1799-9 1660-8 679-3 
3-16 3-28 10-58 — 1-76 
0 
0 © TP 108 1725 1739 1800 1800 984 
Fit Я $a бз (ме 1800 17970 10157 
34 3-74 2:91 
The results of experiments in cases (v) and (vi) are reported in Table 4.
The methods for post-hoc tests in set (v) satisfy the criterion of good fit when
n ≥ 10; the larger value of ‘fit’ for R9 with n = 20 may be discounted as due to
extreme sampling error. In this set G1 and Z1 satisfy the criterion for n ≥ 20,
but R1 is not much over the criterion for all n.

Only one of the values of ‘fit’ satisfies the criterion of good fit in set (vi).
The ‘fit’ values for R9 when n ≥ 10 exceed the criterion because the observed
values of F_m have a smaller variance than that of the non-central F distribution
used; yet the means of the observed and theoretical distributions agree quite
closely, and the proportion of ₀k_i rejected is therefore quite close to Eβ. This
reduction in variance seems to arise from an ‘end-effect’ due to the extreme
φ_j = 0.98 and 0.10. Were the distributions of p_j for these φ_j more symmetrical
around φ_j (as is assumed in the derivation of non-central F), say 0.90 < p_j < 1.06
and −0.10 < p_j < 0.30 respectively, the reduction in the variance of F_m should
not occur or would be less, but such ranges are, of course, impossible. Such
‘end-effects’ disappear as n increases because the standard deviation of p_j
diminishes and there is room from 0.98 to 1 and from 0 to 0.1 to provide an
approach to symmetry. The influence of ‘end-effects’ can also be seen in
planned comparisons ₀k₁ and ₀k₅, which involve φ_j = 0.98 and 0.10, reducing the
number of correct detections below expectation, but this phenomenon
disappears when n ≥ 20. In ₀k₅ and ₀k₇, which involve φ_j = 0.06 and 0.07, in
set (v) ‘end-effects’ show similar results up to n = 30.

In post-hoc tests by Rν₁ when the φ_j range from almost 0 to almost 1, the
values of π_r around ν₁Eβ (say ν₁Eβ ± 1) are underestimates and those beyond
this range overestimates, but a wide range of φ_j gives large Eβ, ν₁Eβ approaches
ν₁, π_ν₁ approaches 1 and almost all contrasts will be rejected by all three methods.
This happened for G9 in set (vi) with n ≥ 20 and it was impossible to calculate ‘fit’.

It is of interest to note the number of times the use of planned tests in
sets (v) and (vi) led to the acceptance of ₋₁k_i. These so-called type III errors
are really accounted for in the type II error-rate and a type II error should be
defined as a failure to detect the true alternative hypothesis. The relative
frequency of accepting ₋₁k_i when ₁k_i is true should always be less than ½α.
In these experiments such errors should occur less than 0.025 × 1800 = 45 times
and as Eβ increases their frequency should decrease. In set (v) R1, G1 and Z1
produced 12, 14, 12; 8, 12, 6; 3, 2, 2; and 1, 2, 1 for n = 5 to 30
respectively. In set (vi) such errors occurred only when n = 5, their frequency
being 0, 1, 1.

After post-hoc tests by R9, eqn. (25) may be used to find the estimates φ̂_j and the
agreement between these and φ_j may be measured by Σ(φ̂_j − φ_j)². This quantity is a
function of q and the correlation between p_j and φ_j (say ρ), because

    Σ(φ̂_j − φ_j)² = […]    (36)

The mean values of ρ, for n = 5 to 30, observed in the sampling experiments
were 0.60, 0.71, 0.76, 0.85 for set (v) and 0.84, 0.86, 0.95, 0.97 in set (vi). Not
only is mean ρ an increasing function of Eβ but, within any set of experiments,
it is an increasing function of q.

Of 800 experiments in set (v), 682 had q > 0 and in these ρ > […] in […],
68/156, 127/193 and 175 out of 197 for n = 5 to 30 respectively. In only […]
experiments was ρ negative, being 0 > ρ > −0.25 in set (v) when n = 5. […]
experiments in set (vi) had q > 0, in no case was ρ negative, ρ < 0.7 in only […]
experiments when n = 5 and ρ > 0.9 in 58, 95, 191 and 200 experiments for n = 5
to 30 respectively.


Inspection of the E values in Table 4 shows no appreciable difference in
power for the three methods Rν₁, Gν₁ and Zν₁ in set (v). Appreciable power
differences, particularly favouring G over R, appear in set (vi) and these are more
marked for post-hoc than for planned tests. This advantage for G or Z must
be weighed against the generally poorer ‘fit’ of these techniques,
which is especially pronounced for post-hoc tests when all ₀k_i are true (Table 3).
When 10 ≤ n ≤ 30, ‘fit’ seems to favour R. If n were increased beyond this
range, ‘fit’ should improve, as would Eβ, for all methods; so the choice
would depend upon ease of calculation and simplicity in conceptualization, which
also seem to favour R.
ERROR OF MEASUREMENT AND THE POWER
OF A STATISTICAL TEST 


By T. ANNE CLEARY 
University of Wisconsin 


and ROBERT L. LINN 
Educational T'esting Service 


Formulae are developed from the assumptions of classical test theory to 
demonstrate the effect of error of measurement on the power of the F test for a 
fixed-effects one-way analysis of variance model. A simple model specifying 
the fixed and variable costs of testing is adopted and equations are derived that 
indicate the sample size that maximizes the non-centrality parameter subject to the 
cost constraints. Given the sample size, the corresponding test length is then 
implied by the cost model. A computer program used for the estimation of 
power for permissible allocations of resources is described. 


1. PROBLEM 


Discussions of the power of statistical tests can be found in almost all basic 
statistics books. Implicit in the usual discussion of power is the assumption 
that the observations are errorless or ‘true’ measurements. Sampling error 
rather than measurement error is considered. 

Тһе test theory literature, on the other hand, is concerned primarily with 

the error of measurement (Gulliksen, 1950; Lord & Novick, 1968). Observations
are considered fallible and repeated measures of the same object are expected to
vary about the 'true' measurement, their expected value.

Sutcliffe (1958) has considered the two types of error simultaneously. He
elaborated the implications of measurement error for the F test of differences
between means and demonstrated how measurement error decreases the
sensitivity of a test of significance.

Lord & Novick (1968) have discussed the implications of an item sampling 
model for mental test theory. In the item sampling model each examinee 
receives a random sample of items from the defined universe of items. Lord & 
Novick have shown that item-sampling methods can improve the efficiency of 
an experimental design. Тһе gain in efficiency is achieved by increasing the 
number of examinees while decreasing the number of items per examinee. 

Тһе item sampling model has strong advantages in many group comparison 
situations. However, practical administrative considerations such as the need 
for common instruction and testing time, the economy of being able to use a 
single scoring key, and the fact that test data must frequently serve several 
purposes often make it desirable to administer the same test to all examinees. 
In such situations one is faced with the problem of deciding whether it is more 
efficient to control the power of a planned statistical test by changing the number
of examinees or the test length and thus the variance of the error of measurement.

Overall & Dalal (1965) discussed the problem of choosing a research design
which maximizes power relative to cost. They concluded that, if there is a
fixed cost per measurement unit and this cost is the same whether the units are
obtained for the same subject or different subjects, it is always better to maximize
the number of subjects.

The purpose of this research was to develop, from the assumptions of
classical test theory, formulae demonstrating the effect of error of measurement
on power. Also investigated were the implications of various assumptions
concerning the fixed and variable costs of testing.


2. STATISTICAL TESTS 

In the derivation and interpretation of statistical tests, the observations are
generally considered to be free of error of measurement, that is, in the language
of test theory, the observations are true scores. If the hypotheses are formulated
in terms of true scores and tested with observed scores, the non-centrality
parameter and, therefore, the power can be quite different from what would be
expected with true scores. Failure to reject the null hypothesis with observed
scores is obviously not equivalent to a failure to reject the null hypothesis with
true scores.


Consider a simple fixed-effects, one-way analysis of variance. The model
for this analysis is

$$T_{ig} = M + A_g + B_{ig} \quad (g = 1, \ldots, G;\ i = 1, \ldots, N), \qquad (1)$$

where $T_{ig}$ is the true score for individual i in group g,
      M is the population true-score mean,
      $A_g$ is the component of the true score which is due to the effect of
      treatment g,
      $B_{ig}$ is the deviation of an individual's score from the group mean, the
      error of the analysis of variance model.

The $B_{ig}$ is assumed to be independently and normally distributed with
expected value of zero and common variance, $\sigma_B^2$. The $A_g$ are unknown
constants and $\frac{1}{G}\sum_{g=1}^{G} A_g^2$ is called $\sigma_A^2$. Table 1 presents the expected mean
squares for this model.

If the null hypothesis of no difference between treatments ($\sigma_A^2 = 0$) is true,
the test statistic (the ratio of the mean square between groups to the mean
square within groups) is distributed as central F with (G-1) and G(N-1)
degrees of freedom. If the null hypothesis is not true the test statistic is
distributed as non-central F with the same degrees of freedom and non-centrality
parameter,

$$\lambda_T = G N \sigma_A^2 / \sigma_B^2. \qquad (2)$$




TABLE 1. EXPECTED MEAN SQUARES FOR A ONE-WAY
ANALYSIS OF VARIANCE OF TRUE SCORES

Source     Degrees of freedom   E(MS)
Between    G-1                  $N \sum_{g=1}^{G} A_g^2/(G-1) + \sigma_B^2$
Within     G(N-1)               $\sigma_B^2$
Total      GN-1


If, rather than true scores, observed scores are used in the analysis, the
model is

$$X_{ig} = M + A_g + B_{ig} + E_{ig}, \qquad (3)$$

where $X_{ig}$ is the observed score for individual i in group g,
      $E_{ig}$ is the measurement error for individual i in group g, and
      M, $A_g$, $B_{ig}$ are the same as in the true-score model.

For each group, g, the measurement errors, $E_{ig}$, are assumed to be independently
and normally distributed with expected value of zero and variance, $\sigma_E^2$.
The expected mean squares for this analysis are shown in Table 2.


TABLE 2. EXPECTED MEAN SQUARES FOR A ONE-WAY
ANALYSIS OF VARIANCE OF OBSERVED SCORES

Source     Degrees of freedom   E(MS)
Between    G-1                  $N \sum_{g=1}^{G} A_g^2/(G-1) + \sigma_B^2 + \sigma_E^2$
Within     G(N-1)               $\sigma_B^2 + \sigma_E^2$
Total      GN-1


If the null hypothesis ($\sigma_A^2 = 0$) is true, the test statistic has the same
distribution as in the error-free case. However, if the null hypothesis is false, the
test statistic is distributed as non-central F with the same degrees of freedom but
with non-centrality parameter,

$$\lambda_X = G N \sigma_A^2 / (\sigma_B^2 + \sigma_E^2). \qquad (4)$$

For $\sigma_E^2$ greater than zero, the non-centrality parameter for the observed-score
analysis, $\lambda_X$, is smaller than the non-centrality parameter for the true-score
analysis, $\lambda_T$. Since power for the test with given degrees of freedom is a
non-decreasing function of the non-centrality parameter, the power for the true-score
analysis is always greater than, or equal to, the power for the observed-score
analysis.
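
The comparison of eqns. (2) and (4) can be illustrated numerically. The following is a minimal sketch, not part of the original paper; the function names and the variance values are hypothetical and chosen only for illustration. It shows that the observed-score parameter equals the true-score parameter multiplied by the ratio of within-group true-score variance to within-group observed-score variance.

```python
def noncentrality_true(G, N, var_a, var_b):
    """Eqn. (2): lambda_T = G * N * sigma_A^2 / sigma_B^2."""
    return G * N * var_a / var_b


def noncentrality_observed(G, N, var_a, var_b, var_e):
    """Eqn. (4): lambda_X = G * N * sigma_A^2 / (sigma_B^2 + sigma_E^2)."""
    return G * N * var_a / (var_b + var_e)


# Hypothetical values: 3 groups of 20, sigma_A^2 = 0.25, sigma_B^2 = sigma_E^2 = 1.
G, N = 3, 20
var_a, var_b, var_e = 0.25, 1.0, 1.0

lam_t = noncentrality_true(G, N, var_a, var_b)
lam_x = noncentrality_observed(G, N, var_a, var_b, var_e)

# The attenuation factor is sigma_B^2 / (sigma_B^2 + sigma_E^2).
attenuation = lam_x / lam_t
print(lam_t, lam_x, attenuation)
```

With equal true-score and error variances within groups, the non-centrality parameter is halved, so any power calculation based on true scores would be too optimistic for the observed-score analysis.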
[...] measurement. The sum of K such measures can be written

$$X^*_{ig} = K(M + A_g + B_{ig}) + \sum_{k=1}^{K} E_{igk}, \qquad (5)$$

where the $E_{igk}$ are the measurement errors for test k. The $E_{igk}$ are independently
and normally distributed with variance $\sigma_E^2$. The non-centrality parameter
for the analysis of the lengthened test is then

$$\lambda_{X^*} = G N \sigma_A^2 / (\sigma_B^2 + \sigma_E^2/K). \qquad (6)$$


$\lambda_{X^*}$ is a strictly increasing function of
both K and N. However, the effect of increasing N is relatively greater than
the effect of increasing K. In addition, the effect of N upon power is increased
[...] the denominator.
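
The relative effect of N and K on eqn. (6) can be checked with a short calculation. This is an illustrative sketch only; the function name and the numerical values are assumptions, not taken from the paper. Doubling N doubles the parameter, whereas doubling K only shrinks the error term in the denominator.

```python
def noncentrality_lengthened(G, N, K, var_a, var_b, var_e):
    """Eqn. (6): lambda_X* = G * N * sigma_A^2 / (sigma_B^2 + sigma_E^2 / K)."""
    return G * N * var_a / (var_b + var_e / K)


# Hypothetical values: sigma_A^2 = 0.25, sigma_B^2 = 1, sigma_E^2 = 9.
args = dict(var_a=0.25, var_b=1.0, var_e=9.0)

base = noncentrality_lengthened(G=3, N=10, K=1, **args)
double_n = noncentrality_lengthened(G=3, N=20, K=1, **args)
double_k = noncentrality_lengthened(G=3, N=10, K=2, **args)

print(base, double_n, double_k)
```

Both changes raise the parameter, but doubling the number of examinees raises it more than doubling the test length, in line with the statement above.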
If one defines the within-group reliability of the observed scores, ρ, as the
ratio of the variance of the true scores to the variance of the observed scores,

$$\rho = \sigma_B^2 / (\sigma_B^2 + \sigma_E^2),$$


3. COST OF AN EXPERIMENT

It is obvious that an experimenter can always increase power by increasing N
and/or K. However, in any practical situation [...] limited resources and [...]
appropriate power within the [...]. Generally, [...] cannot increase both N and K: [...]
N must be decreased.

[...] of Cronbach & Gleser [...] subjects and that t[...] fixed cost, $C_0$, [...]
independent of test length and a cost per test unit, $C_1$.
The cost per group, C, is then given by

$$C = N(C_0 + K C_1), \qquad (7)$$

where N is the number of people per group and K is the length of the test.
[...] implies that for [...] per cell, a change in test length,
from K to K*, must be accompanied by a [...] cell from N to N*, where

$$N^* = N(C_0 + K C_1)/(C_0 + K^* C_1). \qquad (8)$$
ri 
In the special but rather unrealistic case where $C_0$ is equal to zero, maximum
power will always be achieved by setting K equal to one regardless of the test
reliability. This conclusion was drawn by Overall & Dalal (1965) and can be
seen by noting that for $C_0 = 0$, the cost per cell, C, is a function of the product NK.
For a fixed product, NK, both the non-centrality parameter and the degrees of
freedom are maximized for K = 1. The non-centrality parameter, $\lambda_{X^*}$, is
maximized with respect to N and K, subject to the constraint of eqn. (7) and
the restriction that K and N are positive, when

$$N = \begin{cases} N^* & \text{if } N^* \le C/(C_0 + C_1), \\ C/(C_0 + C_1) & \text{if } N^* > C/(C_0 + C_1), \end{cases} \qquad (9)$$

where

$$N^* = C\sigma_B / \left(C_0\sigma_B + \sigma_E (C_0 C_1)^{1/2}\right). \qquad (10)$$

The corresponding value of K is

$$K = \begin{cases} K^* & \text{if } N^* \le C/(C_0 + C_1), \\ 1 & \text{if } N^* > C/(C_0 + C_1), \end{cases} \qquad (11)$$

where

$$K^* = (\sigma_E/\sigma_B)(C_0/C_1)^{1/2}. \qquad (12)$$


Clearly, it does not follow that the power is maximized by these values of N and
K since the degrees of freedom are not necessarily maximized. In practice,
however, it might make little difference whether the experiment was designed
such that $\lambda_{X^*}$ or power was a maximum since (i) the non-centrality parameter
has a major influence on the power of an F test, (ii) only integer values of N
would be used, and (iii) the initial estimates of $C_0$, $C_1$, $\sigma_E$, and $\sigma_A^2/\sigma_B^2$ would be
approximations.
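
The allocation rule of eqns. (9) to (12) can be sketched in a few lines. This is a hedged illustration, not the authors' FORTRAN program: the function name `optimal_allocation` is hypothetical, the results are continuous values that would be rounded in practice, and the rule maximizes the non-centrality parameter rather than power itself.

```python
import math


def optimal_allocation(C, c0, c1, sd_b, sd_e):
    """Return (N, K) maximizing lambda_X* subject to C = N * (c0 + K * c1).

    c0 is the fixed cost per person, c1 the cost per test unit,
    sd_b and sd_e the within-group true-score and error standard deviations.
    """
    n_cap = C / (c0 + c1)  # largest N for which a unit-length test is affordable
    if c0 == 0:
        # No fixed cost: eqn. (10) has no finite value, so K = 1 is always best.
        return n_cap, 1.0
    n_star = C * sd_b / (c0 * sd_b + sd_e * math.sqrt(c0 * c1))  # eqn. (10)
    if n_star > n_cap:
        return n_cap, 1.0                                        # eqns. (9), (11)
    k_star = (sd_e / sd_b) * math.sqrt(c0 / c1)                  # eqn. (12)
    return n_star, k_star


# Illustrative inputs: C = 3000 and sigma_E = 3 * sigma_B (unit-length reliability 0.10).
balanced = optimal_allocation(3000, 50, 50, sd_b=1.0, sd_e=3.0)
no_fixed_cost = optimal_allocation(3000, 0, 100, sd_b=1.0, sd_e=3.0)
print(balanced, no_fixed_cost)
```

For equal fixed and unit costs the sketch allocates 15 persons per group to a test of length 3, which exhausts the budget exactly; with no fixed cost it returns the largest affordable sample with a unit-length test.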

A FORTRAN IV computer program has been written to estimate the power 
for various allocations of resources.* The program computes the maximum 
number of persons per cell permissible within given cost constraints. For each 
value of N from 2 to the maximum, the corresponding K is computed. $\lambda_{X^*}$ is
calculated using formula (6) and parameters provided by the user. Power is
estimated using the Tiku (1965) three-moment approximation. This Tiku
procedure uses central F with the same first three moments as the desired
non-central F. The central F integrals are evaluated by a subroutine FDISTR
written by David Doran at the University of Minnesota.

Fig. 1 presents the power estimates for four different cost conditions where
$\sigma_A^2$ = [...]$\sigma_B^2$, $\sigma_E^2 = 9\sigma_B^2$ and C = 3000. These relative magnitudes of $\sigma_B^2$ and $\sigma_E^2$
correspond to a unit-length test with a within-group reliability of 0.10. When
К —18, the reliability of the test would be 0-75. Power ranging from zero to 
one is on the ordinate and sample size on the abscissa. The non-centrality 
parameter in these examples is quite large, and, if true scores were used, the 


* A listing of the FORTRAN Iv computer program may be obtained from the authors. 
power would be close to one for all [...]. Curves represent different values of $C_0$ [...]
parentheses immediately below each curve: the first number is $C_0$ and the second
number is $C_1$.

[Fig. 1: power (ordinate) against sample size (abscissa); curves labelled (80,20), (50,50) and (0,100).]

FIG. 1. Power and sample size for different cost constraints ($\sigma_A^2$ = [...]$\sigma_B^2$, $\sigma_E^2 = 9\sigma_B^2$,
C=3000, and G=3)

In the lowest curve, $C_0 = 0$ and $C_1 = 100$, the maximum power is achieved
at N=30 and K=1. In this case there is no fixed cost of testing, and the [...]
always be achieved [...]

[...] curve, it readily can be seen that the maximum [...]
achieved by the largest sample size. Rather, [...]
at a sample size equal to sixteen. Here where the [...]

[...] characteristic of the four curves is that the height of the curve
increases as $C_0$ increases. At any gi[...] these curves [...] obtained by [...]

The program makes it possible for [...] possible alternatives. In order to [...]
the program, of course, the experimenter must be able to estimate the various
cost factors, the true score differences which he wishes to detect, and the 
reliability of the unit-length instrument. However, the computational speed 
of the program makes it possible to investigate power for a number of alternative 
estimates of the parameters. (Power estimates are computed at the rate of 
approximately 1000 a minute on the IBM 7044.) 

We have discussed here only the fixed-effects one-way analysis of variance. 
However, the procedure can be readily generalized to any fixed-effects design. 
This general approach should prove useful for consideration of many designs. 
Power estimates for the random-effects one-way design require only evaluation 
of the central F distribution; for the other designs, power estimates are more 


complicated. 
A DESIGN AND MODEL FOR ESTIMATING THE EFFECTS OF
ONE TRIAL ON THE NEXT IN A SEQUENCE HAVING TWO 
TYPES OF TRIAL 


By J. C. OGILVIE, C. T. SURRIDGE and A. AMSEL


University of Toronto 


A statistical design and model is presented for estimating the after-effects of
the treatment given at trial n on trial n+1, when there are two kinds of treatment
in a series. The specific application of the model is to an evaluation of the effects
of stimulation produced by reward and non-reward in a partial reinforcement
situation. A rationale for the model is provided; the derivation of the estimates
and the analysis of variance is described; and the model is applied to the analysis
of some data which form part of a study published elsewhere. The results of
the analysis indicate that reward and non-reward affect immediately following
performance differentially at 20 sec. intertrial intervals but not at 12 min.
intertrial intervals.


1. INTRODUCTION 


When a subject is tested several times under different treatment conditions, 
the effects of a treatment may carry over to subsequent trials. Such carry-over 
or residual effects are usually regarded as a nuisance, to be eliminated statistically 
or by a change in the experimental conditions. Sometimes, however, the 
residual effects are interesting in their own right. Тһе purpose of this paper 
is to present a design and model for estimating the after-effect of the treatment 
given at trial n on trial n+1 when there are two kinds of treatment. 

Тһе specific application of the model is to an evaluation of the effects of 
differential stimulation produced by reward (R) and non-reward (N) in a
discrete-trial partial-reinforcement (PR) situation. As an example, the model 
will be applied to the analysis of some data collected by Surridge (1967 a, b) in 
the context of a broader study of temporal effects and patterns of reward and 
non-reward in the acquisition and extinction of instrumental responses. 

The most influential current intertrial interpretation of PR phenomena 
assumes that reward and non-reward produce distinctly different stimuli which 
remain functional and can be conditioned by reinforcement to the response on 
succeeding trials (Capaldi, 1966). Among others, Bloom (1967) and McCain 
(1966) have observed superior performance on trials following reward (TFR) 
relative to trials following non-reward (TFN) in PR situations. Both Bloom 
and McCain suggested that this finding is indicative of the presence of stimulus 
after-effects produced by reward and non-reward which affect successive 


behaviour in different ways. 
There appears, however, to be a need for evaluating the [...] differential
stimulation on performance on TFRs and TFNs in terms of a statistical [...]
more specifically designed for this purpose than has been the case to date. The
method of analysis often used compares mea[...]
within-day or within-block effects (e.g. Surri[...]
[...] trials are employed (e.g. Capaldi et al., 1962).

[...] present[...] was designed for an 8-trial-per-day proce[...]
[...] balanced in such a way as to provi[...]
[...] every 4-day block, and to ensure that
any trial n except the last for each day will be followed equally often by an R or an N
trial. In this way estimates of the differential effects on TFRs and TFNs [...]
[...], and exact tests of significance used to ascertain the reliability of any
differences observed.


Sequences of length 27. With 
tials over 4 days. The design 


iving 

                Day
            1   2   3   4
Trial   1   R   N   R   N
        2   N   R   R   N
        3   N   R   N   R
        4   R   N   N   R
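
The balance properties claimed for the design can be verified mechanically on the 4 x 4 square printed above. The following is a small sketch under the assumption that rows are trials and columns are days, as the labels indicate; the variable names are hypothetical. It counts R and N trials per day and per trial position, and tallies how often each kind of trial is followed by each kind within a day.

```python
from collections import Counter

# Rows are trials 1-4, columns are days 1-4, copied from the square above.
square = [
    "RNRN",
    "NRRN",
    "NRNR",
    "RNNR",
]

# Read down each column to obtain the order of trials within each day.
days = ["".join(row[d] for row in square) for d in range(4)]

per_day = [Counter(day) for day in days]
per_trial = [Counter(row) for row in square]

# Count within-day transitions from trial n to trial n+1.
transitions = Counter()
for day in days:
    for prev, nxt in zip(day, day[1:]):
        transitions[prev + nxt] += 1

print(days)
print(dict(transitions))
```

Each day and each trial position contains two R and two N trials, and each of the four possible transitions occurs three times, so every trial except the last of a day is followed equally often by R and by N.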

Тһе s ; 


ite easily by adjoining two squares whose columns have 
TABLE 1. A SEQUENCE BALANCED FOR REINFORCEMENT AND AFTER-EFFECTS
OVER 4 DAYS, 8 TRIALS PER DAY

(R and N represent reinforced and non-reinforced trials, respectively)


Sequence 1 


Trials Days 


ооч сл.» м سا‎ 
BPAAZAZARA- 
DJAZZ ZEWN 
ZRRHZmRHHZe 
ZZW wW e 
been randomly permuted. However, most of the 8 x 4 rectangles obtained were
regarded as objectionable because they appeared too patterned or had four
treatments of the same kind in sequence. The final choice is shown in Table 1.
Columns may be permuted without disturbing the balance and the trial order
can be inverted giving 48 possible arrangements.
Тһе design described here resembles a Williams square but is somewhat 


simpler. The derivation of the estimates and analysis of variance for the
Williams square is discussed by Cochran & Cox (1957, pp. 133-141).

The statistical model for the 32 trials in a 4-day block is

$$X_{ij} = \mu + d_i + t_j + r_k + a_q + e_{ij},$$

where $X_{ij}$ is the observation of the jth trial on the ith day,
      $\mu$ is the grand mean;
      $d_i$ is the effect of the ith day (i = 1, 2, 3, 4);
      $t_j$ is the effect of the jth trial (j = 1, 2, ..., 8);
      $r_k$ is the effect of reinforcement; $r_1$ for a reinforced trial and $r_2$ for a
      non-reinforced trial;
      $a_q$ is the after-effect of the previous trial; $a_1$ when the previous trial
      was reinforced, $a_2$ when not reinforced;
      $e_{ij}$ is error; independent, random, normally distributed with mean 0 and
      variance $\sigma^2$.

The usual constraints apply,

$$\sum_{i=1}^{4} d_i = \sum_{j=1}^{8} t_j = \sum_{k=1}^{2} r_k = \sum_{q=1}^{2} a_q = 0.$$

The subscripts k and q, referring to reinforcement condition and after- 
effects, are defined by the subscripts i and j. 

It should be noted that 'after-effects' is used here in a statistical sense and
is not intended to carry any learning theoretic implications.

In the derivation it is convenient both computationally and algebraically
to assume that the grand mean has been eliminated from the model and hence
that $\mu = 0$. This is equivalent to treating the observations as deviations from
their grand mean, which presents no problems with a computer.

The normal equations for the parameters obtained from minimizing the 


residual sum of squares are 


16f,=Rx (k=1, 2); 8d, d, - Ds, 
4j =T; (0=1-5 8) 8d, d, 7 Da, 
8d, + ds Dy; 144, + dg +d, - 25 = Ay, 


8d,-- d= D» 144, 4 d, +da- 2i, = 45, 
where the $R_k$, $T_j$ and $D_i$ are the relevant totals of the observations and $A_1$ and
$A_2$ are the totals of trials following reinforcement and non-reinforcement
respectively. Note that $\sum_i D_i = \sum_j T_j = \sum_k R_k = T_1 + A_1 + A_2 = 0$ because the grand
mean is zero.
The solutions to the equations are


$$\hat r_k = R_k/16, \qquad \hat t_j = T_j/4,$$
$$\hat a_1 = (4A_1 - 4A_2 + F_1)/110, \qquad \hat a_2 = -\hat a_1,$$
4,=(D,~4,)/8, 4 =(D,—4,)/8, | 
d, -(р; -а4)/8, d, = (D, -а4)/8, 


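[...] for days is $\sum_i \hat d_i D_i - \sum_i D_i^2/8$. Hence the sum of
squares for after-effects eliminating days is

$$\hat a_1 A_1 + \hat a_2 A_2 + \sum_i \hat d_i D_i - \sum_i D_i^2/8,$$

which reduces to $(4A_1 - 4A_2 + F_1)^2/440$. If days are ignored then the sum of
squares for after-effects, with $d_i = 0$ for all i, is $(A_1 - A_2)^2/28$. The sum of
squares for days eliminating after-effects is thus

$$\sum_i \hat d_i D_i + \sum_q \hat a_q A_q - (A_1 - A_2)^2/28.$$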
Тһе analysis of Variance 


TABLE 2. ANALYSIS OF VARIANCE TABLE FOR 4-DAY BLOCK, 8 TRIALS PER DAY

($\bar X$ has been subtracted from all observations. $D_i$ [...]. $R_1$ is the total of
reinforced trials. $A_1$ and $A_2$ are the totals of trials following reward and
non-reward. $F_1$ is the total of the two days [...])

Source                               D.F.   S.S.
Trials                               7      $\sum_j T_j^2/4$
Reinforcement                        1      $R_1^2/8$
Days (ignoring after-effects)        3      $\sum_i D_i^2/8$
After-effects (eliminating days)     1      $(4A_1 - 4A_2 + F_1)^2/440$
After-effects (ignoring days)        1      $(A_1 - A_2)^2/28$
Days (eliminating after-effects)     3      $(4A_1 - 4A_2 + F_1)^2/440 + \sum_i D_i^2/8 - (A_1 - A_2)^2/28$
Error                                19     by subtraction
Total                                31     $\sum_i \sum_j X_{ij}^2$
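
The after-effect and day entries of Table 2 can be computed directly from the totals. The sketch below is illustrative only: the function names are hypothetical, the totals are invented numbers (chosen so that the day totals sum to zero), and $F_1$ is simply taken as a given input because its definition is incomplete in the text.

```python
def after_effect_estimate(a1, a2, f1):
    """Estimate of a_1 from the solutions to the normal equations."""
    return (4 * a1 - 4 * a2 + f1) / 110


def ss_after_eliminating_days(a1, a2, f1):
    return (4 * a1 - 4 * a2 + f1) ** 2 / 440


def ss_after_ignoring_days(a1, a2):
    return (a1 - a2) ** 2 / 28


def ss_days_ignoring_after(day_totals):
    return sum(d * d for d in day_totals) / 8


def ss_days_eliminating_after(day_totals, a1, a2, f1):
    return (ss_after_eliminating_days(a1, a2, f1)
            + ss_days_ignoring_after(day_totals)
            - ss_after_ignoring_days(a1, a2))


# Invented totals for one 4-day block (grand mean already removed).
day_totals = [2.0, -1.0, 1.5, -2.5]
a1, a2 = 3.0, -1.0
f1 = day_totals[1] + day_totals[3]  # assumed choice of the two days, for illustration

a1_hat = after_effect_estimate(a1, a2, f1)
ss_elim = ss_after_eliminating_days(a1, a2, f1)
ss_ign = ss_after_ignoring_days(a1, a2)
ss_days = ss_days_eliminating_after(day_totals, a1, a2, f1)
print(a1_hat, ss_elim, ss_ign, ss_days)
```

Whichever order is used, days and after-effects jointly account for the same total, which is why the last line of the table for days can be obtained by subtraction from the other entries.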
This model assumes that there are no interactions between days, trials,
reinforcement, and after-effects. Such an assumption is usually [...]
with behavioural data. However, in this case it can be argued that [...]
four-day segment of behaviour the interactions are likely to be negligible [...]
possibly in the first four days of acquisition learning. The plausibility of this
argument will be examined later.

If there are real interactions, unbiased estimates of the effects will still be
obtained because of randomization, although there will be a loss of precision.
3. EXPERIMENTAL PROCEDURE


The model has been applied to the data of two groups of animals run under
conditions which were identical in every respect except that one group (N = 10)
was run at a 12 min. intertrial interval (ITI), the other (N = 7) at a 20 sec. ITI.
The procedure will be presented only in summary form because the details are


available elsewhere (Surridge, 1967 0). 
Two groups of albino rats were run 192 acquisition trials to food reward, 


8 trials a day. The apparatus was a straight, 3 ft. alley, which provided three
1 ft. time measures of locomotor performance. Subjects received a 50 per cent
PR schedule according to the sequence of R and N trials described in Table 1.
The sequence and its inversion were presented alternately over the six 4-day
blocks, the four daily trial orders within a sequence being randomly permuted.
The presentation of the different orders was also randomized across rats so that
as far as possible no two rats received the same order on any one day.

All analyses were done with reciprocals of raw time scores. The procedure 
given by Box & Cox (1964) has been applied to several other sets of experimental 
data using straight alleys and the reciprocal transformation was found to be the 


most appropriate. 


4. METHOD OF ANALYSIS 


Only the analysis for the third or goal measure will be presented. The
three measures are of course correlated so that a joint analysis of all three would
require a multivariate analysis of variance. The goal measure is the one of
primary interest. The multivariate analysis would not modify the essential
conclusions but would complicate the presentation considerably. 

The finest grain analysis leads to 6 analyses of variance for each rat, one
for each of the blocks of 4 days. This number of analyses for each S would
lead to obvious difficulties of interpretation. As a first step towards summarizing
the results, the 6 analyses for one rat were combined as follows. Suppose there
are k blocks of 4 days and that the separate analysis for each block has been
done. Let $S_b$ be the sum of squares for an effect in a single block (b = 1, 2, ..., k).
Then $\sum_b S_b$ can be partitioned into two parts. The first is obtained from the
formulae in Table 2 using totals across all k blocks and dividing the resultant sum
of squares by k. This represents the main effect averaged over blocks of days.
The interaction is obtained by subtraction from $\sum_b S_b$. Sums of squares can
be partitioned in this manner for trials, reinforcement and after-effects, giving
interactions of each with day blocks.
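
The partition just described can be sketched for the reinforcement effect, whose single-block sum of squares in Table 2 is $R_1^2/8$. This is an illustrative sketch only; the function name and the block totals are hypothetical.

```python
def partition_reinforcement_ss(r1_totals):
    """Split the pooled reinforcement sum of squares into main effect and
    interaction with day blocks.

    r1_totals[b] is the total of reinforced trials in block b, with the
    block's grand mean already subtracted from every observation.
    """
    k = len(r1_totals)
    per_block = [r * r / 8 for r in r1_totals]   # Table 2 formula, block by block
    across_blocks = sum(r1_totals)               # total across all k blocks
    main = (across_blocks ** 2 / 8) / k          # same formula on totals, divided by k
    interaction = sum(per_block) - main          # by subtraction
    return main, interaction


# Invented totals for six 4-day blocks.
main_ss, interaction_ss = partition_reinforcement_ss([2.0, 4.0, 0.0, -2.0, 6.0, 2.0])
print(main_ss, interaction_ss)
```

The main effect carries 1 degree of freedom and the interaction k-1, which for six blocks gives the 5 degrees of freedom shown for the interactions with blocks in the combined analyses.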

The rat by rat analysis is certainly useful, but further summarization is
necessary to permit generalization across a group of rats. The simplest and
most direct method is to take the estimates for one of the parameters for each rat
and each day block and treat these estimates as the data for a two-way analysis
of variance. In the case of after-effects only $\hat a_1$ is used because $\hat a_2 = -\hat a_1$. This
leads to a rat x day block analysis, the interpretation of which is somewhat
unusual. Significance of either of the main effects in fact implies an interaction
of after-effects and either rats or day blocks; in other words, that after-effects
differ from rat to rat or day block to day block. If neither main effect is significant
the overall significance of after-effects is obtained by testing whether the
grand mean differs from zero using the interaction mean square as error. [...]
after-effects exist the expected value of $\hat a_1$ is zero. Similar analyses can be
performed for reinforcement, trials, and days.


5. RESULTS 


The individual and combined analyses for two rats in the 12 min. ITI
group are shown in Tables 3 and 4. Differences between days within blocks
show up only in the first two blocks and this is typical of all rats in both groups.


І 
TABLE 3. INDIVIDUAL AND COMBINED ANALYsES FoR Rar 3, 12 мім. ІТ 
Group 4 
Source D.F. Mean Squares Source D.F. М.» 
Days... 14 5-8 9-12 13-16 17-20 21-24 

52 
Days 3 742 055 053 042 028 019 Days 18 18 
Blocks (B) 5 0-40 
Trials 7 022 065 033 035 049 014 Trials 7 .36 
TxB 35 0 
Reinforcement 1 0:02 000 0.22 010 001 0:04  Reinforce- 0:04 
ment 1 0:14 

RxB 5 
After-effects 1 002 0-18 050 101 038 001 After- 0-51 
effects 1 032 
AxB 5 0% 

Error 19 026 016 0-41 040 0-54 028 Error 114 


TABLE 4. INDIVIDUAL AND COMBINED ANALYSES 


I 
ror Rar 9, 12 мім. IT 
Group 


Source D.F. Mean Squares Combined 
.5. 
Days... 14 58 942 13-16 1720 2134 Some DF. M ү 
06 
Пауа * 59:9 М8 9 очо 00% oo Days 18 00 
? Blocks 5 іі 
Trials 7 012 039 021 020 0:39 0:06 Trials к, 021 
а TxB 
Reinforcement 1 002 000 0-03 001 028 007 Reinforce- 0:00 
тепе 1 0:08 
RxB 5 
After-effects — | 007 002 0-7 011 045 000  After- 0:11 
effects 1 0-18 
АхВ 4 0:17 
Error 19 024 045 023 020 011 041 Error 1 
No other effects are evident and the interactions with blocks in the combined
analysis are not significant. The error mean squares are reasonably homogeneous.
It should be noted that in these and subsequent analyses the day
mean squares are those that ignore after-effects.

The combined analyses of variance for all rats in the 20 sec. ITI group
are shown in Table 5. In every case the after-effects mean square is somewhat
inflated if not significant. The trials effect also is noticeable. No interactions
with day blocks are significant.


TABLE 5. COMBINED ANALYSES FOR Eacu Rar IN 20 SEC. ITI GROUP 


Mean Squares 


Source D.F. 

Rats... 1 2 3 4 5 6 7 
Days 23 046 1850 067 156 132 241 16 
Trials 7 087 13 134 ол 097 126 159 
T x Blocks 35 018 032 092 036 052 0-66 068 
Reinforcement 1 0-03 0-00 0-01 0-06 0:23 0:29 0:14 
Rx Blocks 5 422 007 ООО M ш 
After-effects 1 0-46 1:32 0-55 114 3:42 3:07 1:25 
A x Blocks Б 0:18 0-19 0-68 0-22 0-25 0:51 0-69 
Error 114 0-15 0-32 0-25 0-34 0-48 0-39 0-58 


TABLE 6. SUMMARY ANALYSES OF VARIANCE AND MEANS FOR AFTER-EFFECTS

                              12 min. ITI group    20 sec. ITI group
Source                        D.F.    M.S.         D.F.    M.S.
Blocks                        5       0.0058       5       0.0162
Rats                          9       0.0080       6       0.0075
R x B                         45      0.0076       30      0.0138
Grand mean after-effect               0.016                0.093
Standard error                        0.011                0.018
Mean speed TFR (ft./sec.)             2.237                3.411
Mean speed TFN (ft./sec.)             2.269                2.925


"TABLE 7. SUMMARY ANALYSES OF VARIANCE FOR TRIALS 


12 min. ITI group 20 sec. ITI group 
a 
Source D.F. M.S. D.F. M.S 
Trials 7 7:78 7 81 
Blocks 5 4-63 5 ae 
Rats 9 727 6 5:95 
Тхв 35 1:59 35 3:57 
TxR 63 1:58 42 2-3 
xt 45 0:32 30 55 


TxBxR 315 102 210 Я 
The summary analyses of variance for the estimates of after-effects and
trials taken from each day block are shown in Tables 6 and 7. There was no
evidence that after-effects changed from rat to rat or day block to day block.
The mean after-effect differed significantly from zero for the 20 sec. ITI group
but not for the 12 min. ITI group. The pattern for the three [...]
former group showed an increasing differential effect between TFR and TFN
as the animal approached the goal, an effect not shown in any way with the
12 min. ITI group.
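
The standard errors in Table 6 can be recovered from the rat x block interaction mean squares, which serve as the error term for the grand mean. The sketch below is an illustration with a hypothetical function name; it assumes one estimate per rat per day block (10 rats x 6 blocks and 7 rats x 6 blocks), as the degrees of freedom in the table imply.

```python
import math


def grand_mean_test(mean_effect, interaction_ms, n_rats, n_blocks):
    """Standard error of the grand mean and the ratio of the mean to it,
    using the rat x block interaction mean square as error."""
    se = math.sqrt(interaction_ms / (n_rats * n_blocks))
    return se, mean_effect / se


se_20, t_20 = grand_mean_test(0.093, 0.0138, n_rats=7, n_blocks=6)    # 30 d.f.
se_12, t_12 = grand_mean_test(0.016, 0.0076, n_rats=10, n_blocks=6)   # 45 d.f.
print(round(se_20, 3), round(t_20, 2))
print(round(se_12, 3), round(t_12, 2))
```

The computed standard errors agree with the tabled values to three decimals. The ratio for the 20 sec. group is about five, while that for the 12 min. group is below two, which is consistent with the conclusion stated in the text.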

There was a marked warm-up effect over trials which was [...]
rats and day blocks as is evident in Table 7. Most of it occurred in the first
three trials and the pattern is slightly different for the two groups.

There was no difference in performance on R and N trials in any of the
analyses. This was to be expected.


6. CONCLUSIONS 

The design and model described here discriminated between [...] experimental
situations by demonstrating that one gave rise to after-effects of reinforcement,
albeit small, which the other did not. The advantage of the design is
that it increases precision by eliminating day-to-day and trial-to-trial variation,
both of which were certainly present.

On the other hand, a design in the crossover family assumes no interaction.
This assumption cannot be tested directly, but the relative homogeneity of the
error mean squares for each 4-day block suggests that the assumption is [...].
It is hoped that the results presented here will encourage experimenters
to exploit special-purpose designs.

A much simpler analysis can be carried out by using the means for the
14 TFR and TFN in each block of 4 days in an analysis of variance of rats x
blocks x TFR-TFN. Such an analysis was done by Surridge (1967) and the
conclusions were the same as those reported here with the more refined model.
There was a loss of precision of about 5 per cent due to the confounding of the
TFR-TFN comparison with day difference in the day-block. The efficiency
of the simpler analysis suggests that the balance of the design is one of the most
important aspects of the whole procedure.
THE STUDY OF HOSTILITY IN THE TEMPERAMENTS
OF SPOUSES: DEFINITIONS AND METHODS 


By K. HOPE


Medical Research Council, Unit for Research on the Epidemiology of
Psychiatric Illness¹


The nature of aggression or hostility is discussed, and the possible relations
between hostility in the personalities of husbands and hostility in the personalities
of wives are conceptualized under the heads: (a) Similarity in quantity of aggression;
(b) Similarity in pattern of aggression; (c) Oppositeness or complementarity in
quantity of aggression; (d) Oppositeness or complementarity in pattern of
aggression; and (e) Systematic relations between similar or dissimilar (but not necessarily
opposite) patterns of aggression. The first four possibilities are explored by
imposing a model on questionnaire data obtained from alcoholics and their wives.
The fifth possibility, which is the first to be dealt with, is investigated with the
aid of a technique which follows the pattern of the data without imposing a
model on them.

The analysis demonstrates that exclusive reliance on correlational methods,
however complex, may obscure the essential features of a psychological situation. 
It also shows how the special characteristics of temperamental qualities may be 
accommodated by the construction of an appropriate model. 


PART I 


1. INTRODUCTION 


Тһе relation between the personalities of husbands and wives is one which 
is constantly under scrutiny from amateur as well as professional students of 
human nature. It is widely believed that a certain type of man will tend to 
marry a certain type of woman, but the relation between the types is sometimes 
held to be one of contrast and sometimes one of similarity. It is also widely 
believed that marriage tends to modify the personalities of both parties in the 
interests of reaching and maintaining an equilibrium. The image of balance 
suggests that spouses should be complementary in some aspects of their per- 
sonality. Psychological work in the field of marriage has been well reviewed 
by Tharp (1963). 

It is, unfortunately, not always realized that ordinary correlation techniques 
fail to answer the sort of question which social psychologists should be asking 
about the marriage relationship. Indeed the product-moment correlation 
coefficient can be grossly misleading, particularly when near-zero values lead 
the research worker to deny the presence of systematic relations between the 


¹ Now Lecturer in Methods of Social Research, Department of Social and Administrative 
Studies, Oxford, and Fellow of Nuffield College, Oxford. 
values may, in fact, conceal a whole [...] [gre]atest clinical and social interest. 
[...] marriages have different dominant rel[...] 

As an illustration of the questions which clinicians might ask, and the kind 
of answers which research workers might provide, the present study reports 
findings on the nature of the relation between husband's aggression or hostility 
and wife's aggression or hostility in [...] The sample has no more claim to [...] 
sample which consists of self-selecte[d] [...] proposed techniques [...] 
correlation methods yields negative results, 

2. TESTS 


Five questionnaire tests designed to measure aspects or expressions of 
hostility, aggression or punitiveness were constructed by grouping MMPI 
items according to their face validity as measures of various types of hostility 
(Caine, 1960; Foulds et al., 1960). Three of the tests were designed to measure 
intropunitive manifestations of aggression and two were designed to measure 
extrapunitive ma[nifestations] [...] the course of a validation of this Hostility and 
Direction of Hostility Questionnaire (HDHQ, Cai[ne] [...] aggression, independent 
of the way [...] An examination of item content and of the mean [...] the five tests 
tended to confirm that the first is a test of urge to act out hostility (AH), the second 
is a test of tendency to criticize other people (CO), the third is a test of Projected 
hostility (that is, the tendency to see hostility in other people (PH)), the fourth is a 
test of tendency to self-criticism (SC) and the fifth is a test of guilt (G). It was clear 
that at least one method of expressing hostility had not been covered by the battery 
and that is the method of repression and somatization adopted by the classical 
hysteric. No doubt other modes of expression can be envisaged. Nevertheless, this 
small battery has proved useful for the quantitative assessment of patterns of 
manifestation of hostility, and for the assessment of changes in those patterns over 
time (Hope, 1964). 


3. SAMPLE 


The HDHQ was administered to male patients in a mental hospital unit 
for the treatment of alcoholism.² The wives of the patients were also invited 
to complete the battery and forty of them did so. The ward contains a high 
proportion of private or amenity beds for which the patients may pay substantial 
fees. Patients from all over Britain seek admission. Thus the sampling 
favours wealthy married men whose wives visit the hospital and are willing to 
co-operate. The data consist of the scores of forty alcoholic men on the 
hostility tests and the scores of the wives of these men on the same tests. The 
mean scores are presented in Table 1 together with mean scores on Hostility 


TABLE 1. MEAN SCORES OF 40 ALCOHOLICS AND THEIR WIVES ON FIVE TESTS OF 
HOSTILITY AND ON TWO DERIVED MEASURES 


              AH     CO     PH     SC     G     Hostility   Direction 
Alcoholics   4.15   4.00   0.53   4.53   2.58     15.78       2.95 
Wives        3.35   2.90   0.43   4.33   1.90     12.90       3.88 


and Direction of Hostility. Hostility is simply the sum of scores on all five 
tests, and Direction is scored by doubling the SC score, adding the G score, and 
subtracting the other three scores. These derived measures correspond, 
respectively, to the first and second principal components of the battery (Hope, 
1963). It is usual for men to obtain a higher Hostility score and a lower (more 
extrapunitive) Direction score than women. This pattern appears here, though 
both men and women are more hostile and more intropunitive than the 
normal subjects reported in an earlier investigation (Caine, 1965, p. 273). 
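
The scoring rules for the two derived measures can be sketched in a few lines of 
Python. The function names are illustrative assumptions and are not part of the 
HDHQ itself; the only data used are the alcoholics' mean scores from Table 1. 

```python
def hostility(ah, co, ph, sc, g):
    """Hostility: the simple sum of the scores on all five tests."""
    return ah + co + ph + sc + g


def direction(ah, co, ph, sc, g):
    """Direction: twice SC, plus G, minus the other three scores."""
    return 2 * sc + g - (ah + co + ph)


# Mean scores of the alcoholics from Table 1, in the order AH, CO, PH, SC, G.
alcoholics = (4.15, 4.00, 0.53, 4.53, 2.58)
print(round(hostility(*alcoholics), 2), round(direction(*alcoholics), 2))
```

Because both measures are linear, applying the rules to the tabulated means should 
agree with the tabulated Hostility and Direction means to within rounding of the 
second decimal place. 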


4. NATURE OF HOSTILITY 


Several investigations have shown that there is a general factor in the 
Hostility Battery and an attempt has been made to validate it. It has, however, 
been pointed out (Hope, 1963; Salmon, 1965) that the general factor is, at least 
in part, a measure of willingness to admit to hostility. Scores on the first 
component of the Battery correlate approximately -0.6 with scores on the K 


² The author is indebted to Mr A. R. Forbes and Dr J. B. Ra[...] for permission to use 
data which were collected by them in the course of a larger [...] 
[...] does not have an item in common with the K scale. But the existence o[f the] 
correlation is almost certai[...]t an artifact, since one must assume that [the] 
true variance, as well as the error variance, of each item is contributing bo[th] 
to the Battery and to the K scale. 

It is conceivab[le] [...] [re]ceived little attention, even though [...] set has revealed 
[...] [impo]rtant factor in almost all questionnaire [...] [hy]pothetical example 
serves to illustrate t[...] [Le]t us imagine two tests t_a and t_b, and two perfectly 
valid criter[ia] [...] [m]anifestations of the function [...] [show]s the correlations 
which might be observed among these four variables. The zero correlation between 
the criteria shows that the [...] is disjunctive rather than conjunctive, nevertheless the 
tests are quite highly correlated. We [...] [te]st is a sum of two uncorrelated [...] 
a general factor g which accounts for 90 per cent of the test's variance, [...] 
[re]maining 10 per cent. The two spec[ific] [...] [Ta]ble 3 shows the relations 
between t[...] 


         t_a     t_b     c_a     c_b 
t_a     1.00    0.81    0.32    0 
t_b             1.00    0       0.32 
c_a                     1.00    0 
c_b                             1.00 

TABLE 3. CORRELATIONS BETWEEN THE HYPOTHETICAL FACTORS OF THE TESTS 
IN TABLE 2 AND THE CRITERIA 

         g      s_a    s_b    c_a    c_b 
g        1      0      0      0      0 
s_a             1      0      1      0 
s_b                    1      0      1 
c_a                           1      0 
c_b                                  1 


Table 2. In this artificial example the whole of the tests’ validity is contributed 
by their specific factors. The variance which causes them to be highly correlated 
simply reduces their apparent validity. 

If we now suppose that g is a measure of response set, the s_a and s_b may be 
regarded as purified tests of the function. Because they are perfectly valid 
measures of uncorrelated aspects of the function, the purified tests are themselves 
uncorrelated. 

This simple artificial example raises two problems. The first is that of the 
usefulness of thinking of a temperamental characteristic such as hostility or 
anxiety as a disjunctive phenomenon. By a disjunctive phenomenon I mean a 
characteristic which can be detected in condition A, or in condition B, or in 
condition C, etc., or in any combination of these conditions, even though the 
conditions have little or nothing in common except the characteristic. That 
disjunctive phenomena exist will be taken for granted here. Wittgenstein’s 
analysis of the concept of a ‘game’ in the Philosophical Investigations (1958, 
pp. 66-76) provides a famous example. It could be argued that ‘game’ is an 
everyday concept and it is not surprising to find the word extended, analogously 
or loosely, to areas which lack the essential characteristics of a game. 
Scientific concepts, it might be said, should not display such a confusing 
polymorphism. Whatever the case with scientific concepts in general there is 
good reason for not limiting psychological concepts in this way. Hostility is a 
concept of everyday life and of introspection. Many observations suggest that 
hostility manifests itself disjunctively. It may be admitted that the Freudian 
extension of the concept to include guilt went beyond current usage, but this 
extension serves only to illustrate the accession of explanatory power which 
may follow upon making a concept more, rather than less, polymorphous. 

The jargon of factor analysis provides a ready-made set of terms for the 
algebraic definition of a disjunctive concept. A phenomenon is disjunctive in 
so far as it manifests itself in the specific, group or bipolar variance of a battery 
of tests which measures the phenomenon. The degree of disjunctiveness may 
vary from the extreme case of a set of high negative intercorrelations among 
(purified) tests to the minimal case of a set of high positive correlations. It 
should be noted that, although factor analysis furnishes a useful definition of 
disjunctiveness, the presence of polymorphism in the phenomena interdicts 
the usual uses of factor analysis in the construction of tests or batteries. 

In addition to the problem of polymorphism, the artificial example of 
Tables 2 and 3 raises the problem of the nature of response set. How far is 
the general factor³ of the HDHQ a measure of hostility and how far is it a 
consequence of response sets? The alternatives of this question may not, in 


³ It is unfortunate that ratings of hostility may be expected to prod[uce an] 
artificial general factor, which cannot be taken as evidence for [...] 
aggression. It is also unfortunate that human perceptions of temperamental 
characte[ristics] [...] be assumed to be related to human needs. Thus one would 
expect ratings of e[xtrapunitive] manifestations of aggression to be more valid than 
ratings of intropunitive manif[estations] [...] the observer has a built-in need to 
detect the former but not the latter. 
fact, be reciprocally exclusive. Let us suppose that response set is a manifestation 
of ego mechanisms whereas hostility is a drive. Then we have to concer[n] [...] 
perennial psychological problem of defining and delimiting areas of [...] 
such as aggression, anxiety or sexuality. It is commonly assumed that an [...] 
typically, promotes activity in that area. It may however be that an area or 
functio[n] [...] not only, by peculiarity to it of s[...] organization of (ego) 
mechanisms w[...] function or area.⁴ [...] component which is a measure of [...] 
zero level of hostility there must be [...] [ex]pressed to any appreciable extent [...] 
almost certainly determines respon[se] [...] determine hostility. A habitual leve[l] 
[...] may, in time, modify the level of the functi[on] [...] who is not accusto[med] 
[...] ‘work himself up’ to a constant [...] of aggression, both i[ntropunitive and] 
extrapunitive, which appears to [...] It may be suspected that, [...] [co]rrelated 
with the response s[et] [...] and so the inclusion of a meas[ure] [...] component. 
However, since [...] [resp]onse sets can achieve quite different organizations [...] 
and boundaries of areas of functio[n] [...] measuring the function which the item is 
[...] to measure) may be correlated w[ith] [...] non-valid variance of [...] test 
[...] rejecting correlational techniques [...] 

Sinc[e] [...] ‘aggression’ or ‘hostility’ are derived from [...] English it may be 
asked wh[...] [dis]junctive theory of aggression [...] so. Whereas intelligence is a 
[...] [th]em only in certain respects. The use of [...] [g]uilt is of course a departu[re] 
fro[m] 


⁴ I came across a clear example of such a specific mechanism when a patient, said 
[by] his psychiatrist to be of paranoid personality, gained zero scores on four 
questionnaire tests of para[noia] while obtaining positive scores on 12 out of 
14 tests of other types of personality abnormality. 


ordinary usage and is derived from Freud. Compelling testimony to the 
existence of inwardly directed aggression is provided by Menninger (1938). 

We may summarize the discussion of aggression and the HDHQ by 
enumerating probable or possible factors. There is first of all a possible small 
general factor of aggression, which correlates with the first component of the 
battery. This component also measures a factor of response set, which may be 
general to the battery and which may be partially determined by level of hostility, 
thus contributing to the validity of the component. There is an extrapunitive 
group factor and an intropunitive group factor and these present themselves in 
the form of a bipolar component. There may also be specific factors (which 
would not manifest themselves as such in a principal component analysis (Slater, 
1964)) which contribute to the validity of the battery. The specific, group, or 
bipolar factors may conveniently be lumped together under the heading 
‘disjunctive factors’. 


5. CORRELATION ANALYSIS 


The correlations for all possible pairs of test scores of husbands and wives 
are shown in Table 4. The correlations within husbands and the correlations 
within wives show the usual pattern for this battery. The correlations between 
husbands and wives are all very close to zero. The significance of the regression 
of one set of scores on the other may be tested by calculating the χ² associated 
with a canonical correlation analysis⁵ (Bartlett, 1947). The value of Wilks's 
lambda is 0.5726, giving a χ² of 19.24 with 25 degrees of freedom. Since P 
lies between 0.90 and 0.75 we have no evidence of regression. When this χ² 
is split into five additive parts, one associated with each canonical correlation, 
none of the five attains the 25 per cent level. 
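
The arithmetic behind this test can be sketched as follows. Bartlett's approximation 
takes χ² to be the negative logarithm of lambda times a multiplier; the multiplier 
N - (p + q + 1)/2 = 34.5 used here is an assumption, chosen because it reproduces 
the reported figures (many texts give N - 1 - (p + q + 1)/2 instead). The function 
name is illustrative, and the two canonical correlations are those reported in 
Table 5. 

```python
import math


def bartlett_chi2(wilks_lambda, multiplier):
    """Bartlett's approximation: chi-square = -multiplier * ln(lambda)."""
    return -multiplier * math.log(wilks_lambda)


n, p, q = 40, 5, 5                  # couples, husbands' tests, wives' tests
m = n - (p + q + 1) / 2             # assumed multiplier (34.5)

overall = bartlett_chi2(0.5726, m)  # tested on p * q = 25 degrees of freedom

# Part of chi-square associated with one canonical correlation r:
# that root contributes the factor (1 - r**2) to lambda.
first = bartlett_chi2(1 - 0.5138 ** 2, m)
second = bartlett_chi2(1 - 0.3765 ** 2, m)
print(round(overall, 2), round(first, 2), round(second, 2))
```

Since lambda is the product of the factors (1 - r²) over all five roots, the five 
parts computed in this way add up to the overall χ². 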

The first two canonical variates are reported in Table 5. It has been 
demonstrated (Kendall, 1957, p. 74) that a regression equation is an unstable 
dimension if there are collinearities among the independent variables. However, 
the smallest latent root of either the alcoholics’ or the wives’ correlation matrix 
is the fifth root of the alcoholics’ matrix and accounts for 4.5 per cent of the 
variance. This, together with the stability of the components from sample to 
sample (Hope, 1963), is sufficient to show that collinearities are unlikely to occur 
in the correlation matrix of these tests. It happens that a similar (unpublished) 
analysis of the hostility scores of another sample of alcoholics and their wives has 
yielded a significant first canonical variate which is similar to the second canonical 
variate of the present analysis. This finding is parenthetical to the main argument 
of the paper but it is introduced to show that the near-zero correlations 
of the present data are only an extreme example of what may be no more than a 
tendency towards low correlation coefficients. 


⁵ Analysis by canonical correlations was first described by Hotell[ing] [...] 
(1947) introduced British psychologists to the method. A [...] 
analysis is given by Hope (1969). 
TABLE 5. FIRST TWO CANONICAL VECTORS RELATING THE PATTERNS OF 
HOSTILITY OF ALCOHOLICS AND THEIR WIVES 


                            First variate         Second variate 
                         Alcoholics   Wives    Alcoholics   Wives 
Tests   AH                  0.19      0.45       -0.02      0.26 
        CO                 -0.61     -0.07        0.63     -0.55 
        PH                  0.19     -0.81       -0.36      0.33 
        SC                 -0.29     -0.33       -0.51     -0.21 
        G                   0.69      0.17        0.46      0.69 
Canonical correlation           0.5138                0.3765 
χ²                             10.57                   5.27 
d.f.                             9                       5 


The research worker might be tempted to conclude that there is no association 
between a wife’s hostility and that of her husband. But only one kind of 
association has been put to the test. Furthermore, even if an association had 
been established, we should have been faced by the notorious difficulty of 
interpreting regression analyses (Hope, 1969, passim). In Part II of this article 
the concept of association is broken down into various kinds of association, and it 
is shown how, by imposing a model on the data, the degree to which each type 
of association occurs can be readily estimated. 


PART II 


1. INTRODUCTION 
In the first part of this study the scores of a sample of male alcoholics and 
their wives on five measures of hostility or aggression were correlated, and it was 
shown that there was no significant regression of one set of scores on the other. 
It was pointed out that the regression analysis does not exhaust the possible 
relations between husbands’ and wives’ hostility. 


2. METHOD 


In order to examine further relations of clinical and psychological interest 
the scores were analysed according to a three-factor factorial design in the analysis 
of variance which is, in its early stages, an exact copy of that illustrated by Burt 
(1955) and Mahmoud (1955). The three factors are: (a) couples, that is the 
differences among the mean aggression scores of the married pairs, each mean 
being derived from 10 scores: the five scores of the husband and the five scores 
of the wife; (b) tests, that is the differences among the five mean scores on the 
tests, each mean being derived from the scores of all 80 persons on that test; and 
(c) sex of spouse, that is the difference between the mean score of all 40 men 
on all five tests and the same score for the women. 
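
The first stages of such a couples x tests x sex analysis, with a single score in 
each cell, can be sketched as below. This is a generic decomposition for a balanced, 
completely-crossed design and not a transcription of the calculations of Burt or 
Mahmoud; the function names and the data layout x[c][t][s] are assumptions made 
for illustration. 

```python
def mean(values):
    values = list(values)
    return sum(values) / len(values)


def three_factor_mean_squares(x):
    """Sums of squares, d.f. and mean squares with one score per cell.

    x[c][t][s] is the score of couple c on test t for spouse s
    (s = 0 for the husband, s = 1 for the wife).
    """
    C, T, S = range(len(x)), range(len(x[0])), range(len(x[0][0]))
    nc, nt, ns = len(C), len(T), len(S)
    g = mean(x[c][t][s] for c in C for t in T for s in S)
    mc = [mean(x[c][t][s] for t in T for s in S) for c in C]
    mt = [mean(x[c][t][s] for c in C for s in S) for t in T]
    ms = [mean(x[c][t][s] for c in C for t in T) for s in S]
    mct = [[mean(x[c][t][s] for s in S) for t in T] for c in C]
    mcs = [[mean(x[c][t][s] for t in T) for s in S] for c in C]
    mts = [[mean(x[c][t][s] for c in C) for s in S] for t in T]

    ss = {
        "couples": nt * ns * sum((mc[c] - g) ** 2 for c in C),
        "tests": nc * ns * sum((mt[t] - g) ** 2 for t in T),
        "sex": nc * nt * sum((ms[s] - g) ** 2 for s in S),
        "couples x tests": ns * sum(
            (mct[c][t] - mc[c] - mt[t] + g) ** 2 for c in C for t in T),
        "couples x sex": nt * sum(
            (mcs[c][s] - mc[c] - ms[s] + g) ** 2 for c in C for s in S),
        "tests x sex": nc * sum(
            (mts[t][s] - mt[t] - ms[s] + g) ** 2 for t in T for s in S),
        # With one score per cell this term also contains the error.
        "couples x tests x sex": sum(
            (x[c][t][s] - mct[c][t] - mcs[c][s] - mts[t][s]
             + mc[c] + mt[t] + ms[s] - g) ** 2
            for c in C for t in T for s in S),
    }
    df = {
        "couples": nc - 1, "tests": nt - 1, "sex": ns - 1,
        "couples x tests": (nc - 1) * (nt - 1),
        "couples x sex": (nc - 1) * (ns - 1),
        "tests x sex": (nt - 1) * (ns - 1),
        "couples x tests x sex": (nc - 1) * (nt - 1) * (ns - 1),
    }
    return {k: (ss[k], df[k], ss[k] / df[k]) for k in ss}
```

For 40 couples, five tests and two spouses the degrees of freedom are 39 for couples 
and 39 x 4 x 1 = 156 for the second-order interaction. 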
two other persons? Is the comparison anything more than the statistical 
elimination of irrelevant variance? The model which I propose here assumes 
that the comparison makes sense. This assumption is a psychological assump- 
tion about the nature of the family and the familial occurrence of hostility. 

As far as this point the calculations have been formally identical with those 
of Burt and Mahmoud. But it is not possible to adopt their model because it 
cannot be assumed that all three factors of the present study are random. In 
deciding whether an effect is fixed or random, some decision must be made on 
the purpose for which the analysis is being carried out. It should be clear that, 
from our analysis, we wish to arrive at a general statement about a population 
of marriages, rather than a statement which is specific to our sample from that 
population. The factor of couples is therefore a random effect. If the study 
were to be repeated it would be applied to a fresh sample of marriages. In such 
a repetition there could be no random sampling of the two levels of sex of spouse, 
and so sex must be regarded as a fixed effect (Eisenhart, 1947). 

In practice, if the study were repeated, the same five tests would be 
employed, and this suggests that tests is a fixed effect. But we are interested 
in generalizing to hostility, and not merely in predicting response to those 
particular tests. If tests constitutes a random effect then, it may be said, it 
should be possible to conceive of the five tests as a random sample from an 
infinite population of questionnaire tests of hostility (cf. Plackett’s (1960) 
review of analysis of variance models and Pilliner’s (1965) discussion of the 
application of such models in psychology). The concept of an infinite population 
of questionnaire tests is not an easy one to handle. Test theorists sometimes 
speak as though the only limitation on the possible output of test items is the 
unimaginativeness of the test-constructor. But items are not simply random 
concatenations of words. The possibilities are limited by the conceptual 
limitations of test-respondents, by cultural factors, and by the need to maintain 
a satisfactory degree of discrimination. If the theorist admits that the number 
of possible items is finite he can, nevertheless, argue that infinitely many tests 
can be constructed by repeated sampling of items, that an item is not constant in 
all its occurrences but is modified by the nature of the other items of any test in 
which it occurs, and that, even if no infinite population is available, the sampling 
fraction⁶ can be made vanishingly small. 

Although it is possible to construct tests by random sampling of items, the 
five tests of the HDHQ were not constructed in this manner. If the production 
of the HDHQ were to be notionally attributed to an hypothetical process of 
sampling, the method invoked would be that of stratified, rather than simple 


⁶ A general formulation of a two-factor, row x column (R x C) design speci[...] 
is no reason to suppose that such a sampling procedure would exalt the general 
factor at the expense of disjunctive factors. 

The model for the analysis is therefore a completely-crossed design with 
two random factors, namely couples and tests, and a fixed factor, namely sex of 
spouse. The expectations of the mean squares which are consequent upon these 
assumptions (Snedecor, 1956) are given in Table 7. In this table and in the 
subsequent formulae capital letters represent variance components⁷ and lower-case 
letters represent multiplying factors. Thus the term sCT represents the 
variance attributable to the interaction of couples and tests, multiplied by the 
number of levels for sex, which is two. E represents the variance attributable 
to error. Because no independent estimate of E is available its maximum and 
minimum values are given in Table 7. Immediately above these two numbers 
are the corresponding minimum and maximum for CTS. The sum of E and 
CTS is given by the mean square for CTS. The variance components C and 
CS may be directly estimated, but CT has a minimum and maximum which 
depend on the corresponding values of CTS and E. It should be understood 
that these minima and maxima are limits of point estimates, and not confidence 
limits. 


TABLE 7. COMPONENT ANALYSIS OF THE MEAN SQUARES OF TABLE 6 ON THE 
ASSUMPTION OF A MIXED MODEL (COUPLES AND TESTS RANDOM, SEX OF SPOUSE 
FIXED) 


Source                     M.S.        Expectation     Variance component 
Couples                  2.431749      E+sCT+tsC       0.178313 
Couples x Tests          0.648618      E+sCT           0.011901    0.324309 
Couples x Sex            2.474522      E+CTS+tCS       0.369941 
Couples x Tests x Sex    0.624815      E+CTS           0.000000    0.624815 
Error = E                                              0.624815    0.000000 
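
The variance components of Table 7 follow from the expectations by subtraction, 
and the steps may be sketched as below. The variable and function names are 
illustrative assumptions; the mean squares are those of Table 7, with t = 5 tests 
and s = 2 levels of sex. 

```python
t, s = 5, 2                        # number of tests, number of levels of sex

# Mean squares from Table 7.
ms_c, ms_ct = 2.431749, 0.648618   # couples; couples x tests
ms_cs, ms_cts = 2.474522, 0.624815 # couples x sex; couples x tests x sex

# Components with a single estimate.
C = (ms_c - ms_ct) / (t * s)       # (E + sCT + tsC) - (E + sCT) = tsC
CS = (ms_cs - ms_cts) / t          # (E + CTS + tCS) - (E + CTS) = tCS


def ct_component(error):
    """CT for an assumed error variance E lying between 0 and ms_cts."""
    return (ms_ct - error) / s     # (E + sCT) - E = sCT


ct_min = ct_component(ms_cts)      # the whole CTS mean square is error
ct_max = ct_component(0.0)         # there is no error at all
print(round(C, 6), round(CS, 6), round(ct_min, 6), round(ct_max, 6))
```

The two limits of CT correspond to the two extreme ways of dividing the 
couples x tests x sex mean square between E and CTS. 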


4. SIMILARITY OF SPOUSES 


The first measure of similarity which we shall calculate i[s] [...] 
terms of variance components and the number of tests t as fo[llows]: 

\[ \frac{tC + CT}{t(C + CS) + CT + R} \]

where R = E + CTS. 

Single estimates are available for all the terms of the formul[a] [...] 
When CT takes its minimum value (Table 7) the value of the formula is 0.27 
and when CT takes its maximum value the value of the formula is 0.33. This 
coefficient, which has been variously called an intraclass correlation coefficient 
(Burt, 1955), a coefficient of external consistency or equivalence (Mahmoud, 


⁷ The term is used loosely to refer to both fixed and random effects. 
similarity or dissimilarity to each couple. In some studies it may be appropriate 
to measure the similarity of spouses in the modified space of the previous para- 
graph. But it is more likely that the appropriate space will be the original space 
of the analysis. 

If we imagine a person’s ego-mechanisms as a set of switches, which 
determine the lines along which current shall flow, then the coefficient of pattern 
similarity measures the resemblance between spouses in their setting of the 
switches. No account is taken of the amount of current flowing. 


5. COEFFICIENTS OF SIMILARITY AND REGRESSION ANALYSIS 


The coefficients of similarity or resemblance differ from the canonical 
correlations which were reported in Part I in that the canonical correlations are 
maximized values whereas the coefficients are estimates. The coefficients are 
subject to the restriction that every test is given an equal weight whereas the 
weight of a test in a canonical variate may take any value. Further, in the calculation 
of the coefficients each test applied to the husbands is paired with the same 
test applied to the wives. No such restriction is imposed in the regression 
analysis. In consequence high positive similarity coefficients entail high canonical 
correlations but high canonical correlations do not entail high similarity 
coefficients. The interpretation of the coefficients depends, of course, on the 
assumptions which we make about the nature of hostility. 


6. THE NATURE OF COMPLEMENTARITY 

In ordinary usage A and B are said to be complementary if, taken together, 
they sum to a whole (the primary usage), or if they cancel one another out (the 
secondary usage). When A and B are angles the whole is a right angle, when 
they are colours the whole is a neutral colour. Translated into arithmetical 
terms the two usages differ mainly in that A and B have the same sign in the 
primary case but are of opposite sign in the secondary case. The secondary use 
also assumes the equality of A and B in absolute magnitude. Analogously, two 
persons may be said to be complementary if (a) each contributes specific skills 
and attitudes to a task which, on this account, can be accomplished by the two 
together but by neither alone, or (b) they are opposite in a certain trait, that is the 
sum of the trait positions of a number of pairs of complementary persons is more 
or less constant, although the difference between the members of each pair is 
considerable. Broadly speaking, we may identify the primary usage as comple- 
mentarity of persons and the secondary usage as complementarity of traits. 

The primary usage is teleological; it envisages complementarity of traits 
as complementarity for something. This purposive element has been introduced 
into some definitions of complementariness of personality which nevertheless 
represents a retreat from ordinary usage. Udry (1963), for example, cites the 
following definition: ‘Trait X is complementary to trait Y if anyone with trait 
X will be gratified by trait Y in a mate and anyone with trait Y will be gratified 
domination, conflict, alcoholism, or psychosis. It thus becomes possible to 
examine the importance of complementarity as a cause of any of these conse- 
quences. 


7. COMPLEMENTARITY OF SPOUSES 
In order to measure the importance of the differences between husbands and 
wives in quanta of aggression we require another coefficient, called by Mahmoud 
(1955) the coefficient of trait variability: 

\[ \frac{tCS}{t(C + CS) + CT + R} \]

It lies between 

\[ \frac{1.849705}{3.377986} \quad\text{and}\quad \frac{1.849705}{3.690394}. \]

Trait variability is 0.55 if there is no third-order interaction component, and 
0.50 if there is no error. These values do not reflect a negative correlation 
between husbands’ and wives’ hostility in the usual sense. They indicate that 
in some cases the wife is more aggressive and in some cases the husband. Strictly 
speaking, the finding relates only to quantity of aggression relative to the average 
wife or the average husband. The large (and highly significant; F = 3.96; 
d.f. = 39, 156; P < 0.001) variance of the couples x sex term may reflect the 
distinction between two patterns of alcoholics’ marriages hypothesized by clinicians: 
that in which the wife is dominant and coercive, and that in which she 
is submissive and weak. It may however simply reflect a diversity of marital 
situations which alcoholics share with non-alcoholic husbands. 
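
The limits quoted for the coefficients can be checked with the short sketch below, 
which uses the variance components of Table 7. The function names are illustrative. 
The numerator tC + CT used for the first coefficient of Section 4 is an assumption: 
it is a reading of a damaged formula which reproduces the value of 0.33 reported 
there for the maximum of CT. 

```python
t = 5                              # number of tests
C, CS = 0.178313, 0.369941         # single estimates from Table 7
R = 0.624815                       # R = E + CTS, the CTS mean square


def denominator(CT):
    return t * (C + CS) + CT + R


def trait_variability(CT):
    return t * CS / denominator(CT)


def similarity(CT):
    # Assumed numerator tC + CT (see the note above the code).
    return (t * C + CT) / denominator(CT)


for CT in (0.011901, 0.324309):    # minimum and maximum of CT
    print(round(trait_variability(CT), 2), round(similarity(CT), 2))

# F ratio for couples x sex: its mean square over the CTS mean square.
f_ratio = 2.474522 / 0.624815
```

The minimum of CT corresponds to the case of no third-order interaction component 
and the maximum to the case of no error. 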

The sum of squares of the couples x sex interaction term is derived by 
squaring the entries in a matrix of interaction scores. Each person has one 
score in this matrix and a wife's score is necessarily the same as her husband's 
except that it is opposite in sign. The sum of squares of the couples term is 
similarly derived from a vector of means in which each value represents the 
average level of hostility (after partialling out sex differences) in the marriage. 
A graph of (a) husbands’ interaction scores against (b) average level of hostility 
in the marriage reveals a tendency for husbands to be more hostile than wives 
if the total hostility in the marriage is high, and for wives to be more hostile 
than husbands if total hostility is low. The product-moment correlation 
coefficient between (a) and (b) is 0.29 and is not quite significantly different from 
zero (two-tailed test). The correlation between (A) Hostility score of husband 
minus Hostility score of wife (both calculated from the simple sum of raw scores 
on all five tests) and (B) Hostility score of husband plus Hostility score of wife 
is also 0.29. Since (A) correlates 0.98 with (a), and (B) correlates 0.99 with (b), 
it is evident that the association between factors which is suggested by the 
analysis of variance can be approximated by a much simpler analysis of raw scores. 
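
The simpler raw-score analysis amounts to correlating a difference score with a 
sum score, and may be sketched as follows. The function names are illustrative 
and the five pairs of Hostility scores are hypothetical values invented for the 
example; they are not data from the study. 

```python
import math


def pearson(x, y):
    """Product-moment correlation of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def difference_sum_correlation(husbands, wives):
    """Correlate (A) husband minus wife with (B) husband plus wife."""
    diff = [h - w for h, w in zip(husbands, wives)]
    total = [h + w for h, w in zip(husbands, wives)]
    return pearson(diff, total)


# Hypothetical Hostility scores for five couples (not data from the study).
husbands = [22, 9, 15, 30, 12]
wives = [11, 14, 10, 16, 13]
r = difference_sum_correlation(husbands, wives)
```

The covariance of a difference with a sum equals the husbands' variance minus the 
wives' variance, so the sign of this correlation shows which spouse has the more 
variable Hostility score. 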
factor between spouses falls short of it. No account is taken of any disjunctive 
factor between spouses. Anomalous values may arise if the general factor within 
spouses is small. 


The relations between the coefficient of trait variability and the coefficient 
of person stability are such that an explication of them is a source of considerable 
conceptual clarification. If the coefficient of person stability is positive and high 
the coefficient of trait variability must be close to zero. 
persons poor in Hostility (trait variability ≥ 0.5) but it remains true that some 
marriages have more hostility than others (person stability = 0.33). Neither of 
these factors is revealed by an ordinary correlation coefficient. The correlation 
between husbands' and wives' Hostility (calculated as in Table 1) is 0.03. 


8. COMPLEMENTARITY OF PATTERN 


So far each of the first three terms of the analysis of variance has supplied 
the numerator of a coefficient, and we are left with the second-order interaction 
term as a measure of the tendency of a husband's pattern of aggression to be 
opposite or complementary to that of his wife. No test of significance is possible 
for this term because it is a compound of error and interaction variance. The 
nature of the second-order interaction term may be illustrated by giving the 
scores of a few of the couples after all the main and first order interaction effects 
have been partialled out. The sum of squares for the couples x tests x sex 
interaction is simply the sum of squares of all these scores over all couples. The 
second-order interaction scores of the four couples in Table 8 have been chosen 


TABLE 8. SCORES OF FOUR SELECTED COUPLES ON FIVE HOSTILITY TESTS AFTER 
ALL EFFECTS EXCEPT THE SECOND-ORDER INTERACTION EFFECT HAVE BEEN 
PARTIALLED OUT 


                      Alcoholic                              Wife 
Couple     AH     CO     PH     SC     G        AH     CO     PH     SC     G 
A        -0.69  -0.13  -0.34   0.73   0.42     0.69   0.13   0.34  -0.73  -0.42 
B         0.43   0.22   0.54  -0.54  -0.65    -0.43  -0.22  -0.54   0.54   0.65 
C         0.17  -0.59   0.79   0.19  -0.56    -0.17   0.59  -0.79  -0.19   0.56 
D         0.13   0.83  -1.59   0.32   0.30    -0.13  -0.83   1.59  -0.32  -0.30 


to illustrate two patterns of relationship. In marriage A the husband is low on 
the extrapunitive tests and high on the intropunitive, that is, he is at the 
intropunitive end of the direction of hostility dimension. His wife is at the 
extrapunitive end. In marriage B the converse holds. In marriages C and D a 
different dimension emerges which is principally defined by low CO and high 
PH. This might well be a reflection of the realities of the marriage situation 
since if one partner is critical the other feels ‘got at’. The critical partner does 
not feel threatened while the threatened partner is not critical. In marriage C 
the wife is critical and the husband is threatened while in marriage D we see 
the converse. 
The second-order interaction scores necessarily show the ways in which 
alcoholics and their wives are opposite or complementary. The scores of one 
partner are a mirror image of the scores of the other. A principal component 
analysis of the second-order interaction score matrix of either the men or the 
women will therefore show the main patterns of oppositeness or complementarity 
which exist in the sample (it will not, of course, yield a test of significance). 

The correlations between the interaction scores of the alcoholics are shown in 
Table 9 and their components are shown in Table 10. (The components of 
the variance-covariance matrix are practically identical with those of the 
correlation matrix and so there is no need to argue the merits of the two types of 
analysis here.) 


TABLE 9. CORRELATIONS BETWEEN SECOND-ORDER INTERACTION SCORES OF 40 
ALCOHOLICS ON FIVE TESTS OF HOSTILITY 


        AH      CO      PH      SC      G 
AH     1.00   -0.07   -0.10   -0.41   -0.45 
CO             1.00   -0.43   -0.54   -0.32 
PH                     1.00   -0.43   -0.50 
SC                             1.00    0.47 
G                                      1.00 


TABLE 10. PRINCIPAL COMPONENTS OF CORRELATIONS BETWEEN SECOND-ORDER
INTERACTION SCORES OF 40 ALCOHOLICS ON FIVE TESTS OF HOSTILITY (TABLE 9)

                            Components
Tests              1       2       3       4       5
AH                0.34   -0.05    0.82    0.12    0.44
CO                0.33   -0.70   -0.39   -0.18    0.46
PH                0.36    0.68   -0.39    0.13    0.49
SC               -0.58    0.15    0.09   -0.65    0.46
G                -0.56   -0.16   -0.08    0.72    0.37
Latent root       2.27    1.17    1.08    0.49    0.00
Variance (%)     45.42   23.31   21.55    9.72    0.00


A comparison between the correlations in Table 9 and the correlations of the
original scores (Table 11) shows what the preceding steps in the analysis of
variance have done to the data. The original score analysis yields the usual
pattern of correlations among all the tests, the correlations being, on average,
higher within the extrapunitive and intropunitive categories than between them.
As a check, the components of the original scores of alcoholics and their wives
were calculated separately. One component matrix was then multiplied by
the transpose of the other in order to yield the matrix of cosines between pairs
of vectors in the test space. None of the cosines between corresponding
components was less than 0.90, and the components of the two samples were very
similar to the components of the joint sample shown in Table 12. Stability
of structure from sample to sample is a constantly recurring feature of the HDHQ
(Hope, 1963; Slater, 1964). The effect of the analysis of variance has been to
remove the general factor leading to positive correlations; it has also removed
the extrapunitive group factor, and it has emphasized the negative relation
between extrapunitiveness and intropunitiveness. The intropunitive group
factor remains largely untouched.


TABLE 11. CORRELATIONS BETWEEN SCORES OF 80 PERSONS (40 ALCOHOLICS AND
THEIR WIVES) ON FIVE TESTS OF HOSTILITY

        AH      CO      PH      SC      G
AH     1.00    0.45    0.33    0.20    0.37
CO             1.00    0.51    0.28    0.49
PH                     1.00    0.25    0.28
SC                             1.00    0.66
G                                      1.00


TABLE 12. PRINCIPAL COMPONENTS OF THE CORRELATIONS OF THE SCORES OF 80
PERSONS ON FIVE TESTS OF HOSTILITY (TABLE 11)

                            Components
Tests              1       2       3       4       5
AH                0.41    0.35    0.74    0.39   -0.08
CO                0.49    0.32   -0.08   -0.72   -0.37
PH                0.41    0.45   -0.63    0.43    0.22
SC                0.42   -0.63   -0.15    0.31   -0.55
G                 0.50   -0.42    0.12   -0.23    0.71
Latent root       2.55    1.01    0.69    0.48    0.27
Variance (%)     50.91   20.28   13.73    9.63    5.43


The first principal component of the correlations between second-order
interaction scores (Table 10) is, quite clearly, the component which emerges as
the second component of analyses of the correlation matrix of raw scores on
these tests. It is a measure of the direction in which a man turns his hostility,
either outwards against others or inwards against himself. There is a very
close relation (cos θ = 0.9857) between this component and the second component
of the correlations between original scores (Table 12). The extremes of the
dimensions are exemplified in the scores of couples A and B in Table 8.
Alcoholic A has a score of -2.19 and alcoholic B has a score of +2.29 on this
component. Their wives obtain the same scores with signs reversed. It is
emphasized that the existence of this component as an important source of
second-order interaction variance does not reflect a negative correlation between
the direction of hostility of husbands and wives. The correlation between the
Direction scores of husbands and wives (calculated as in Table 1) is -0.07.

The second component of the interaction score matrix has some relation
(cos θ = 0.8611) to the fourth component of the original score matrix and is
exemplified by alcoholics C and D, who obtain scores of 2.05 and -3.15
respectively on this dimension. There are fairly clear relations between the
remaining components of the interaction scores and the remaining components
of the original scores, and these are represented by cosines of 0.8689 (third
interaction component = third original component), 0.9549 (fourth interaction
component = fifth original component) and 0.9862 (fifth interaction component =
first original component).
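The cosines quoted above can be approximated from the two-decimal loadings of Tables 10 and 12. The sketch below is illustrative only: the three entries that are hard to read in the printed tables (PH on interaction component 4, PH on original component 2, and G on original component 5) are taken here as 0.13, 0.45 and 0.71, which is an assumption based on the columns having unit length, and the function names are introduced for this example.

```python
import math

# Rows are the tests AH, CO, PH, SC, G; columns are components 1..5.
TABLE_10 = [  # components of the second-order interaction scores
    [0.34, -0.05, 0.82, 0.12, 0.44],
    [0.33, -0.70, -0.39, -0.18, 0.46],
    [0.36, 0.68, -0.39, 0.13, 0.49],   # 0.13 is an assumed reading
    [-0.58, 0.15, 0.09, -0.65, 0.46],
    [-0.56, -0.16, -0.08, 0.72, 0.37],
]
TABLE_12 = [  # components of the original scores
    [0.41, 0.35, 0.74, 0.39, -0.08],
    [0.49, 0.32, -0.08, -0.72, -0.37],
    [0.41, 0.45, -0.63, 0.43, 0.22],   # 0.45 is an assumed reading
    [0.42, -0.63, -0.15, 0.31, -0.55],
    [0.50, -0.42, 0.12, -0.23, 0.71],  # 0.71 is an assumed reading
]


def column(matrix, j):
    """Extract column j of a row-major matrix as a list."""
    return [row[j] for row in matrix]


def cosine(u, v):
    """Cosine of the angle between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / math.sqrt(sum(a * a for a in u) * sum(b * b for b in v))


def cosine_matrix(a, b):
    """Cosines between every column of a and every column of b."""
    n = len(a[0])
    return [[cosine(column(a, i), column(b, j)) for j in range(n)]
            for i in range(n)]


cosines = cosine_matrix(TABLE_10, TABLE_12)
```

Because the loadings are rounded to two decimals, the values obtained this way agree with the reported cosines only to about two decimal places; `cosines[0][1]`, for instance, corresponds to the relation between the first interaction component and the second original component.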

It should perhaps be explained that the general factor emerges as the fifth
component of the second-order interaction matrix because, in the successive
stages of partialling out this factor, the analysis of variance gave the same weight
to each of the five tests. This is not quite as effective in removing the general
factor as the method of differential weighting which is employed in principal
components. The present fifth factor is the residue of the general factor over
and above the part extracted by the application of equal weights.

In the absence of a test of significance one cannot claim to have established
that alcoholics and their wives are opposite or complementary in their pattern
of expression of aggression or punitiveness. But if they are opposite then the
most important dimension contrasting them is the dimension of the direction
in which they turn their hostility. Another important dimension is one which
contrasts the non-critical but threatened person with his or her critical, non-
threatened spouse.
It should again be emphasized that the analysis does not imply that there
is a systematic tendency for the temperament of husbands to have a particular
negative relation to the temperament of wives. Both husbands and wives are
fairly evenly distributed along each dimension, so that there is no concentration
of husbands at one end of a continuum with wives at the other.


9. COMPARISON WITH ANOTHER SAMPLE OF MARRIAGES 


The population sampled by the present data is not typical either of
alcoholics or of marriages, and the sampling depends on the willingness of two
parties to co-operate. It would not be possible to define an adequate 'control'
population for purposes of comparison. Nevertheless it is of interest to look
at any normal sample which may be available, to see how their similarity and
complementarity compare with those of the alcoholics. The Edinburgh
Marriage Guidance Council¹⁰


TABLE 13. MEANS OF MARRIAGE GUIDANCE COUNSELLORS AND THEIR SPOUSES ON FIVE
TESTS OF HOSTILITY AND ON TWO DERIVED MEASURES (CF. TABLE 1)

            AH            CO     PH     SC            G      Hostility   Direction
Husbands    [illegible]   3.02   0.57   [illegible]   1.45   12.28       2.36
Wives       3.13          2.47   0.51   4.43          1.45   11.98       4.19


¹⁰ I wish to express my thanks to the Council, and to the counsellors and their
spouses, for their ready consent to my request for these data.
that all the alcoholics are men, whereas most of the counsellors are women (in
some cases both husband and wife are counsellors). Both groups are drawn
largely from the middle and upper middle classes. A comparison between the
means reported in Table 13 and the means of alcoholics and their wives (Table 1)
reveals only one difference of note, and that is that the alcoholics are somewhat
more hostile than the husbands of the counsellors.

The regression of the husbands' variables upon the wives' variables yields
a chi square of 38.15 with 25 degrees of freedom (P < 0.05). The first canonical
correlation is 0.58, compared with 0.51 for the alcoholics, and is significant
(3 71717). The results of applying the analysis of variance model, with 
exactly the same assumptions as in the case of the alcoholics, are reported in 
Table 14, and the coefficients for the two samples are set out side by side in
Table 15. The purpose of the table is primarily illustrative, partly because
neither of the groups represents any general population, and partly because the 
groups differed in the conditions of administration of the tests. 


TABLE 14. THREE-FACTOR ANALYSIS OF VARIANCE OF THE SCORES OF MARRIAGE
GUIDANCE COUNSELLORS AND THEIR SPOUSES ON FIVE TESTS OF HOSTILITY
(cf. TABLES 6 AND 7)

Source   d.f.   S.S.         M.S.        Variance component
C         46    149.665193   3.253591    0.242205
CT       184    153.004096   0.831544    0.152180    0.415772
CS        46     60.328657   1.311493    0.156862
CTS      184     97.002055   0.527185    0.000000    0.527185
Error      -       -            -        0.527185    0.000000


TABLE 15. COEFFICIENTS OF SIMILARITY AND COMPLEMENTARITY IN THE
MARRIAGES OF ALCOHOLICS AND MARRIAGE GUIDANCE COUNSELLORS

(Where two coefficients are reported the first is based on the assumption of zero
second-order interaction, and the second is based on the assumption of zero error.)

Coefficient            Alcoholics       Marriage Guidance Counsellors
Stability              0.27   0.33      0.51   0.55
Person stability       0.33             0.61
Pattern similarity     0.02   0.34      0.22   0.44
Trait variability      0.55   0.50      0.29   0.27
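The counsellors' columns of Tables 14 and 15 can be reproduced from the four mean squares. The sketch below is a reconstruction rather than the author's stated procedure: the divisors (ten scores per couple, two per couple-test cell, five per couple-sex cell) and the expressions for the coefficients in terms of variance components are inferred from the printed figures and from the pooling-square formulae of the Appendix, and all function and variable names are hypothetical.

```python
T, S = 5, 2  # five tests, two sexes (husband and wife)

# Mean squares for the counsellors, copied from Table 14.
MEAN_SQUARES = {"C": 3.253591, "CT": 0.831544, "CS": 1.311493, "CTS": 0.527185}


def variance_components(ms, zero_error):
    """Variance components under one of the two assumptions of Table 15.

    zero_error=False: the CTS mean square is treated as pure error.
    zero_error=True:  the CTS mean square is treated as pure interaction.
    Either way the same residual variance enters the coefficients.
    """
    residual = ms["CTS"]
    c = (ms["C"] - ms["CT"]) / (T * S)
    cs = (ms["CS"] - residual) / T
    ct = ms["CT"] / S if zero_error else (ms["CT"] - residual) / S
    return {"C": c, "CT": ct, "CS": cs, "residual": residual}


def coefficients(v):
    """Coefficients of Table 15 expressed through the variance components."""
    total = T * v["C"] + T * v["CS"] + v["CT"] + v["residual"]
    return {
        "stability": (T * v["C"] + v["CT"]) / total,
        "person_stability": v["C"] / (v["C"] + v["CS"]),
        "pattern_similarity": v["CT"] / (v["CT"] + v["residual"]),
        "trait_variability": T * v["CS"] / total,
    }


first = coefficients(variance_components(MEAN_SQUARES, zero_error=False))
second = coefficients(variance_components(MEAN_SQUARES, zero_error=True))
```

Rounded to two decimals, `first` and `second` give the paired entries in the counsellors' columns of Table 15; person stability does not depend on the assumption, which is why the table reports a single value for it.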


10. CONCLUSIONS 


The second part of this article has shown that common questions concerning
the personality dimensions of marriage can be refined, restated, and submitted
both to tests of significance and, more interestingly, to techniques of estimation.
The method of analysis can, of course, be generalized to any relationship and to
any number of relata provided that the number of relata is constant.
The evidence of the sample of alcoholics is that husbands and wives resemble
one another to some extent in amount and pattern of expression of aggression.
There is a hint in the analysis that, as the total quantum of aggression in the
marriage increases, so does the husband tend to have the more aggression.
This phenomenon occurs in spite of the standardization of test scores for husbands
and wives separately.

Two factors of oppositeness or complementarity have been explored. One 
has been shown to have some importance. The other could not be tested for 
significance, but an examination of its structure lent it plausibility. Both factors 
have the property that they tend to reduce correlations between characteristics 
of husbands and characteristics of wives. In consequence a search for a relation 
by means of a technique (canonical correlation analysis) which does not impose 
the conditions of a factorial design in the analysis of variance led to a
non-significant result.

The significant factor revealed a moderately strong tendency for married
partners to differ in their quanta of hostility: sometimes the wife predominates 
and sometimes the husband. The plausible but unproved factor reveals a 
tendency for extrapunitive persons to have intropunitive spouses. But again 
there is no reason to suppose that one end of the dimension is more densely 
inhabited by either sex. The same factor also has a dimension contrasting 
persons who feel ‘ got at’, and who do not voice criticisms of others, with their 
spouses who are critical of others but do not feel threatened.

The purpose of the analysis has been to illustrate a method rather than to
provide substantive findings. Nevertheless it is important to observe that the
method offers the possibility of making comparisons between different types or
styles of marriage. From the two reported analyses it appears that the
resemblance between marriage guidance counsellors and their spouses in amount and
pattern of hostility exceeds the resemblance between alcoholics and their spouses.
If we accept that hostility is a hydra-headed phenomenon this finding may be 
summarized by saying that counsellors and their spouses resemble one another 
more than do alcoholics and their spouses. 

It has been shown that complementarity or oppositeness, as well as simi- 
larity, may be quantified. Although the analysis discounts the fact that husbands 
tend, on average, to be more hostile than wives, it remains true that the dis- 
crepancy between the mean hostility of a husband and that of his wife is greater 
among alcoholics than it is among marriage guidance counsellors. 

It is, perhaps, desirable to utter two caveats. The first is that the present
findings have practically no bearing on questions about assortative mating.
Only a prospective study could adequately test the effects, if any, of
temperamental characteristics on choice of spouse. The second is that shared or
antithetic relations cannot be confined within a single area of temperament. It is
quite probable that there are, for example, systematic relations between the
aggression of one spouse and, say, the anxiety of the other.




11. SUMMARY


Possible models of hostility, punitiveness or aggression were explored in 
Part I and it was argued that specific and group (disjunctive) factors may be an 
important source of test validity, particularly in the field of temperament. 
Both general and disjunctive factors require investigation as possible sources of 
similarities or complementarities between husbands and wives. The implications
of disjunctive factors for customary techniques of item analysis and test
construction have been pointed out en passant.

In Part II several possible questions concerning the marriage relationship 
were considered, and a coefficient or method of analysis was employed to answer 
each question. The coefficients are defined as algebraic ratios, but the concep- 
tual structure of each coefficient has been set out verbally, and the meanings of 
different possible relations between coefficients have been explored in psycho- 
logical terms. It is shown that attempts to redefine the concept of spouses' 
complementarity are both misleading and unnecessary, and that complemen- 
tariness, in its ordinary sense, may be studied quantitatively by means of readily 
applicable techniques. 


APPENDIX 
The Random Model and Analysis by Correlations 

It has been pointed out that the scores analysed by the analysis of variance 
technique are identical with the centred and standardized scores which are 
employed in the calculation of the correlations of Table 4. In the special case 
of a completely-random three-factor model, which is illustrated by Burt (1955) 
and Mahmoud (1955), certain interesting relations between the correlation 
matrix and the variance components may be established. The relations are not
only valuable in that they provide an alternative means of calculating the 
variance components from the correlations, but they also enable us to see the 
meaning of the coefficients of similarity and complementarity in terms of 
correlations. 

For the sake of illustrating these relations (and also as a way of emphasizing 
the practical importance of choosing the right model) we shall apply the random 
three-factor model to the data of the alcoholics’ analysis. The expectations of 
the mean squares for this model (Model II) are given by Snedecor (1956). 
Applying these equations to the mean squares of Table 6 gives us the following
components:

    C      -0.006658
    CT      0.011901
    CS      0.369941
    CTS     0.624815

The components have the property of summing to a total variance of unity.
Under the assumptions of a completely random model it is not necessary to
distinguish between CTS and E in order to arrive at a single estimate for each
coefficient. The coefficients, defined as in the text, are:

    Stability            -0.008718
    Person stability     -0.018326
    Pattern similarity    0.018692
    Trait variability     0.754017
It can be seen that they differ substantially from the coefficients which emerge
from a more appropriate model (Table 15).

Mahmoud has illustrated the relations between this type of factorial analysis
of standardized scores and the correlation matrix (Table 4). The latter may be
represented schematically by a pooling square:


\[
\begin{pmatrix}
1 & \bar r_s & \bar r_{sd} & \bar r_d \\
\bar r_s & 1 & \bar r_d & \bar r_{sd} \\
\bar r_{sd} & \bar r_d & 1 & \bar r_s \\
\bar r_d & \bar r_{sd} & \bar r_s & 1
\end{pmatrix}
\]

The 2 x 2 submatrices stand for square matrices of any order. The units
represent the diagonal elements of Table 4. $\bar r_s$ represents the average of the
correlations between different tests applied to the same person (whether husband
or wife). Thus $\bar r_s = 0.3633$ for Table 4 is the arithmetic mean of 20 coefficients:
the 10 off-diagonal elements of the husbands' submatrix and the 10 off-diagonal
elements of the wives' submatrix. $\bar r_{sd} = 0.0052$ is the mean correlation of the
same tests applied to different persons, and is the mean of the five diagonal
elements of the upper-right submatrix. $\bar r_d = -0.0067$ is the mean of the 20
off-diagonal elements of this submatrix.


The proportion of the total variance attributable to each term in the analysis
of variance is as follows:

    Term    Formula
    C       $\bar r_d$
    CT      $\bar r_{sd} - \bar r_d$
    CS      $\bar r_s - \bar r_d$
    CTS     $1 - \bar r_s - \bar r_{sd} + \bar r_d$


It may easily be verified that the above formulae have the same numerical values
as the variance components calculated from the factorial design in the analysis
of variance.

The interpretation of the various coefficients of similarity and complementarity
is facilitated by expressing each coefficient in terms of elements of the
pooling square (t is the number of tests):

    Coefficient           Formula
    Stability             $[\bar r_{sd} + (t-1)\bar r_d] / [1 + (t-1)\bar r_s]$
    Person stability      $\bar r_d / \bar r_s$
    Pattern similarity    $(\bar r_{sd} - \bar r_d) / (1 - \bar r_s)$
    Trait variability     $t(\bar r_s - \bar r_d) / [1 + (t-1)\bar r_s]$
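The verification mentioned above can be carried out from the three pooled correlations alone. The following sketch assumes the rounded values quoted in the text (0.3633, 0.0052 and -0.0067) and the formulae as reconstructed in the two lists above; the function name `from_pooling_square` is introduced here for illustration.

```python
def from_pooling_square(r_s, r_sd, r_d, t):
    """Variance proportions and coefficients implied by a pooling square.

    r_s  : mean correlation of different tests, same person
    r_sd : mean correlation of the same test, different persons
    r_d  : mean correlation of different tests, different persons
    t    : number of tests
    """
    proportions = {
        "C": r_d,
        "CT": r_sd - r_d,
        "CS": r_s - r_d,
        "CTS": 1 - r_s - r_sd + r_d,
    }
    denominator = 1 + (t - 1) * r_s
    coefficients = {
        "stability": (r_sd + (t - 1) * r_d) / denominator,
        "person_stability": r_d / r_s,
        "pattern_similarity": (r_sd - r_d) / (1 - r_s),
        "trait_variability": t * (r_s - r_d) / denominator,
    }
    return proportions, coefficients


proportions, coefficients = from_pooling_square(0.3633, 0.0052, -0.0067, t=5)
```

Since the inputs are rounded to four decimals, the results match the six-decimal components and coefficients listed earlier in this Appendix only to about the third or fourth decimal place.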
The fact that the pooling square consists of four 2 x 2 matrices indicates
that the model treats any pair of tests as falling into one or other of the categories
'same' or 'different'. The interpretation of the coefficients may therefore
be simplified by assuming only two tests, X and Y.

The coefficient of stability is an index of the strength of the association
between husbands' X + Y and wives' X + Y. The denominator of the coefficient
is an estimate of the combined strength of presumed general and disjunctive
factors of hostility within the group of husbands and within the group of wives,
and so the strength of general and disjunctive factors between spouses is
quantified in terms of the strength of the same factors within spouses. The coefficient
may achieve plus unity if husbands' X is correlated with wives' X, etc., or if
husbands' X is correlated with wives' Y, etc. It may become negative because
a disjunctive factor within spouses is reflected inversely between spouses.
The coefficient is near zero if both (a) the correlations between husbands' X and
wives' X, etc., are close to zero, and (b) the correlations between husbands' X
and wives' Y, etc., are close to zero. These two conditions are satisfied by the
present data and so the coefficient of stability is, virtually, zero.

The coefficient of pattern similarity is a much simpler statistic than the
coefficient of stability. It is a measure of the strength of the association between
husbands' X - Y and wives' X - Y. If husbands' AH is correlated with wives'
AH but not with wives' CO, PH, etc., and if this pattern holds for all five tests,
then pattern similarity is positive and may be perfect even though person
stability is zero. I chose the name 'pattern similarity' because a positive
value indicates that husbands and wives tend to share the same pattern of
expression of hostility. The coefficient measures the tendency for husbands and wives
to share a profile over the five tests, irrespective of any difference in the absolute
amount of hostility in the marriage, and irrespective of any difference between
husbands and wives in absolute level of hostility.

The denominator of the coefficient is a measure of the strength of presumed
disjunctive factors within spouses. The coefficient therefore quantifies the
strength of a disjunctive factor between spouses in terms of a presumed
disjunctive factor within spouses. There is certainly no evidence of a between-spouse
disjunctive factor in the present data (on the assumption of Model II). In
other words, even though a person's expression of hostility tends to be specific
to certain manifestations, the mode of specificity is not shared by spouses.

The coefficient of person stability assesses the strength of the association 
between a presumed general factor in husbands and the same general factor in 
wives. The denominator is an estimate of the strength of the general factor 
within husbands and wives, and so the strength of the between-spouse factor is 
quantified in terms of the strength of the within-spouse factor (anomalous values 
may occur if there is no within-spouse general factor). The interpretation of the 
coefficient is quite straightforward. If the average amount of hostility in a 
marriage shows a systematic tendency to deviate from the general mean then the 
coefficient will be positive. If the scatter of marital aggression round the general 
mean can be wholly explained by other sources of variance then the coefficient
is zero. If the scatter is less than other sources of variance would lead us to 
expect then the coefficient is negative. 
NOTES AND CORRESPONDENCE


A FURTHER PROCEDURE FOR DETERMINING SLATER'S i AND ALL
NEAREST ADJOINING ORDERS


By J. P. N. PHILLIPS
Department of Psychology, University of Hull

1. INTRODUCTION 


Provably valid procedures for determining Slater's statistic i (Slater, 1960, 1961), the
minimum size of the set of preferences in a given pair-comparison preference table which
are inconsistent with some ordering of the objects (the minimum being taken over all
possible orderings), and all nearest adjoining orders which attain this minimum, have been
given by Remage & Thompson (1966) and Phillips (1967). The writer recently
inadvertently blundered across the following procedure, which appears to offer advantages over
both the preceding ones.


2. GENERAL DESCRIPTION 


The procedure is based on the following very simple observation. 

Proposition 1. A nearest adjoining order a > b > c > ... > n has the property that each
immediately adjacent pair of objects in it is ordered as specified by the pair-comparison
table.

Proof. Let O be an order with the immediately adjacent pair of objects f > g, whilst
the pair-comparison table contains the preference g > f: then O is not a nearest adjoining
order, since an order with fewer inconsistencies is obtained by switching f and g.
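The argument of the proof can be illustrated with a small hypothetical example (the four objects and six preferences below are invented for the purpose and do not come from the tables of this note). Switching an adjacent pair leaves every other pair's relative order unchanged, so the number of inconsistencies changes by exactly one.

```python
def inconsistencies(order, preferences):
    """Count preferences (a, b), read 'a is preferred to b', that the order reverses."""
    position = {obj: rank for rank, obj in enumerate(order)}
    return sum(1 for a, b in preferences if position[a] > position[b])


def swap_adjacent(order, k):
    """Return a copy of the order with the objects at positions k and k + 1 switched."""
    swapped = list(order)
    swapped[k], swapped[k + 1] = swapped[k + 1], swapped[k]
    return swapped


# Hypothetical preferences; the pair g > f contradicts the order e > f > g > h.
preferences = {("e", "f"), ("e", "g"), ("e", "h"),
               ("g", "f"), ("f", "h"), ("g", "h")}
order = ["e", "f", "g", "h"]

before = inconsistencies(order, preferences)
after = inconsistencies(swap_adjacent(order, 1), preferences)
```

Here `before` exceeds `after` by one, so the original order cannot be a nearest adjoining order.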

The procedure consists of carrying out a tree-search of all orders with the property
of Proposition 1. However, at each node of the tree, a lower bound is obtained to the
inconsistency of the order segment it represents: then, after the first ramification has been
searched to the full, providing an upper bound to i, nodes whose lower bound exceeds this
upper bound need not be investigated further: the upper bound may, moreover, be improved
in the course of the procedure.

The procedure is made more efficient by prior use of a modified version of the
Slater-Alway rules (Slater, 1961) to obtain a good first approximation to a nearest adjoining order,
and also by the use of the further very simple observation.

Proposition 2. Let S = a > b > c > ... > f be any (possibly empty) initial segment of
an order, and let R = {g, h, ..., n} be the set of objects not in S. If some member of R,
say k, is preferred, in the pair-comparison table, to all other members of R, then any order
in which S is immediately followed by k will have fewer inconsistencies than any order in
which S is immediately followed by some other member of R.

Proof. Only inconsistencies involving k and some other member of R will be affected,
and they are uniquely minimised by placing k at the head of R.


3. ILLUSTRATION 


The procedure will be illustrated by the artificial example shown in Table 1a.
Step 1 (Slater-Alway Rule 1). Simultaneously permute the rows and columns of the
table so that the column sums (number of +'s in each column) are, as far as possible, in
increasing order from left to right. The results are shown in Table 1b.
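Step 1 amounts to sorting the objects by their column sums. The sketch below assumes the preferences of Table 1a as they can be read from the printed table (row object preferred to column object wherever a + appears); the dictionary `PREFERRED` and the function names are introduced here for illustration, and ties are simply left in their original order, which is one possible reading of 'as far as possible'.

```python
# PREFERRED[a] lists the objects to which object a is preferred (Table 1a, as read).
PREFERRED = {
    1: [8],
    2: [1, 6, 8],
    3: [1, 2, 7, 8],
    4: [1, 2, 3, 5, 6, 7, 8, 9, 10],
    5: [1, 2, 3, 8, 10],
    6: [1, 3, 5, 7, 8, 9],
    7: [1, 2, 5, 8],
    8: [],
    9: [1, 2, 3, 5, 7, 8, 10],
    10: [1, 2, 3, 6, 7, 8],
}


def column_sums(preferred):
    """Number of +'s in each column: how many objects are preferred to each object."""
    sums = {obj: 0 for obj in preferred}
    for losers in preferred.values():
        for obj in losers:
            sums[obj] += 1
    return sums


def rule_1_order(preferred):
    """Objects arranged by increasing column sum (stable sort keeps ties in place)."""
    sums = column_sums(preferred)
    return sorted(preferred, key=lambda obj: sums[obj])


sums = column_sums(PREFERRED)
order = rule_1_order(PREFERRED)
```

With these data `order` reproduces the row and column arrangement of Table 1b, beginning with object 4 (column sum 0) and ending with objects 1 and 8 (column sums 8 and 9).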


¹ Some readers may prefer to proceed straight to the illustration of the procedure in the
following section and then refer back to its justification in the present one.
TABLE 1. ILLUSTRATION OF THE PROCEDURE: THE SLATER-ALWAY RULES

(a) An inconsistent pair-comparison table

        1   2   3   4   5   6   7   8   9  10
  1     .   -   -   -   -   -   -   +   -   -
  2     +   .   -   -   -   +   -   +   -   -
  3     +   +   .   -   -   -   +   +   -   -
  4     +   +   +   .   +   +   +   +   +   +
  5     +   +   +   -   .   -   -   +   -   +
  6     +   -   +   -   +   .   +   +   +   -
  7     +   +   -   -   +   -   .   +   -   -
  8     -   -   -   -   -   -   -   .   -   -
  9     +   +   +   -   +   -   +   +   .   +
 10     +   +   +   -   -   +   +   +   -   .

Sum     8   6   5   0   4   3   5   9   2   3

(b) The outcome of applying Rule 1 to (a)

        4   9   6  10   5   3   7   2   1   8
  4     .   +   +   +   +   +   +   +   +   +
  9     -   .   -   +   +   +   +   +   +   +
  6     -   +   .   -   +   +   +   -   +   +
 10     -   -   +   .   -   +   +   +   +   +
  5     -   -   -   +   .   +   -   +   +   +
  3     -   -   -   -   -   .   +   +   +   +
  7     -   -   -   -   +   -   .   +   +   +
  2     -   -   +   -   -   -   -   .   +   +
  1     -   -   -   -   -   -   -   -   .   +
  8     -   -   -   -   -   -   -   -   -   .

Sum     0   2   3   3   4   5   5   6   8   9

(c) The outcome of applying the modified Rule 2 to
the inconsistent subtable contained in (b)

       10   6   9   5   3   7   2
 10     .   +   -   -   +   +   +
  6     -   .   +   +   +   +   -
  9     +   -   .   +   +   +   +
  5     +   -   -   .   +   -   +
  3     -   -   -   -   .   +   +
  7     -   -   -   +   -   .   +
  2     -   +   -   -   -   -   .

Sum     2   2   1   3   4   4   5

Note that the sum of the first column of the permuted table is 0, and the sums of the
last two columns are 8 and 9: by Proposition 2, the permuted orders of these columns give
unique initial and final segments for all nearest adjoining orders, and only the inconsistent
subtable obtained by deleting these columns and the corresponding rows need be further
investigated.

(If the inconsistent subtable is of order 3 or 4, then the calculations are finished: if
of order 3, then there is just one possibility, shown in its two manifestations in Table 2a, b,
with the three nearest adjoining orders p>q>r, q>r>p and r>p>q; if of order 4,
then again there is just one possibility, shown in its four manifestations in Table 2c-f, with
the nearest adjoining order p>q>r>s: in either case, i = 1. However, this does not
occur in the present example.)


TABLE 2. TWO SPECIAL CASES
(If the inconsistent subtable obtained by applying Rule 1 is of order 3 or 4,
then there is in either case essentially only one solution.)
Step 2 (Slater-Alway Rule 2, modified²). For each cell of the upper right-hand half
of the inconsistent subtable in turn, starting with the corner cell, then the two cells adjacent
to it, then the diagonal set of three adjacent to these two, etc., examine the right-angled
band of cells including it, and reaching leftwards and downwards to the diagonal. If this
band contains more -'s than +'s, then switch the corresponding rows and columns.
Continue the process as long as possible. In the present case this leads only to the switch
of rows and columns 9 and 10, resulting in Table 1c.
FIGURE 1. Illustration of the procedure: the tree search. For explanation, see text.


Step 3. Tree-search. (All the calculations are shown in Fig. 1.) Consider in turn
all objects as possible first members of a nearest adjoining order for the inconsistent
subtable. Take first object 10: it is seen from Table 1c that two objects are preferred to 10,
so that any order beginning with 10 will have at least two inconsistencies: note this lower
bound in brackets beside 10 in the tree. Then consider all possible immediate successors
to 10 in the tree; by Proposition 1 and from Table 1c these are 6, 3, 7 and 2. Take first 6:
from the table, two objects are preferred to 6, but one of them is 10, so that the choice of 6
as next object adds only one to the lower bound, making it three, which is noted beside 6
in the tree. Then consider all possible immediate successors to 6: these are 9, 5, 3 and 7.
Take first 9: only one object, namely 6, is preferred to it, so that it does not add anything
to the lower bound, which is noted beside it: thus it is preferred to all the remaining objects,
and is therefore, by Proposition 2, a unique optimal immediate successor to 6, so that
5, 3 and 7 need not be further considered. Proceeding similarly, 5 is adjoined, adding one
to the lower bound, and then 3, adding nothing. The two remaining objects, 7 and 2,
must occur in that order, and add nothing to the lower bound, which now serves as a
preliminary estimate of, and upper bound to, i: the tree segment constructed gives a
tentative nearest adjoining order for the subtable 10>6>9>5>3>7>2.


² This modification is necessary, at least as far as computer implementation is concerned,
because the original rule can lead to a closed loop.
Now move back up the branch to examine 3, 7 and 2 as immediate successors to 9.
The first two each give another tentative nearest adjoining order; however, for 2 the lower
bound reaches the value of six, exceeding the preliminary estimate of i, so that no successors
to it need be further considered. 3, 7 and 2 are similarly rejected as immediate successors
to 10.

Now that the tree starting with 10 is complete, each other object is considered in turn
as the starting point of a tree. In the tree starting with 6, when the tentative nearest
adjoining order 6>9>5>10>3>7>2 is reached, it is found to have an associated value
of three as estimate of i; since this is less than the previous estimate, the latter, and all
previous tentative nearest adjoining orders are scrapped. This tree, and the next two, are
then fairly rapidly concluded, and it then transpires that 3, 7 and 2 need not be considered
at all as first objects, since more than three objects are preferred to each of them.

Thus, at the end of the whole search, the current estimate of i is three, which is therefore
its true value, and the unique nearest adjoining order for the whole table is
4>6>9>5>10>3>7>2>1>8.
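The tree search of Step 3 can be sketched as a short recursive routine. This is not the ALGOL 60 procedure referred to in the Discussion: it is a minimal illustration that applies Proposition 1 and the lower-bound cut-off directly to the whole of Table 1a, omitting the Slater-Alway preliminaries and the Proposition 2 short cut, and it assumes the preferences as read from the printed table. The names `PREFERRED` and `nearest_adjoining_orders` are hypothetical.

```python
# PREFERRED[a] lists the objects to which object a is preferred (Table 1a, as read).
PREFERRED = {
    1: [8],
    2: [1, 6, 8],
    3: [1, 2, 7, 8],
    4: [1, 2, 3, 5, 6, 7, 8, 9, 10],
    5: [1, 2, 3, 8, 10],
    6: [1, 3, 5, 7, 8, 9],
    7: [1, 2, 5, 8],
    8: [],
    9: [1, 2, 3, 5, 7, 8, 10],
    10: [1, 2, 3, 6, 7, 8],
}


def nearest_adjoining_orders(preferred):
    """Return (i, orders): the minimum inconsistency and every order attaining it."""
    beats = {(a, b) for a, losers in preferred.items() for b in losers}
    best = {"i": len(beats) + 1, "orders": []}  # start above any attainable value

    def extend(segment, remaining, bound):
        if bound > best["i"]:
            return  # lower bound already exceeds the current upper bound
        if not remaining:
            if bound < best["i"]:
                best["i"], best["orders"] = bound, []  # scrap earlier tentative orders
            best["orders"].append(tuple(segment))
            return
        for x in sorted(remaining):
            if segment and (segment[-1], x) not in beats:
                continue  # Proposition 1: adjacent pairs must agree with the table
            rest = remaining - {x}
            # Every unplaced object preferred to x becomes one inconsistency.
            added = sum((y, x) in beats for y in rest)
            extend(segment + [x], rest, bound + added)

    extend([], frozenset(preferred), 0)
    return best["i"], best["orders"]


i, orders = nearest_adjoining_orders(PREFERRED)
```

For the ten objects of the example this visits only a small fraction of the 10! possible orders, and it returns the value of i and the nearest adjoining order stated above.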


4. DISCUSSION 


Validity. The validity of the procedure has already been informally proved in the
course of describing it (relying on Propositions 1 and 2), and therefore needs no further
demonstration here.

Practicability. The procedure is even less practicable for hand computations than those
of Remage & Thompson (1966) and Phillips (1967), being effectively unworkable by hand
for more than about eight objects. It is, however, even better suited for computer
implementation, and an ALGOL 60 procedure which executes it has been developed.
Copies thereof may be obtained from the author.

Relative efficiency. It is obvious that the present procedure (New Slater) shares the 
advantage enjoyed by the procedure (Slater) described by Phillips (1967) of being most 
efficient with data for which i and the number, j, of nearest adjoining orders is small. 
Indeed, with a large number of objects and highly consistent data, Slater, which works by 
eliminating circular triads, might be slightly more efficient than New Slater, with its 
initial row and column permutations. At the other extreme, Slater had the unfortunate
feature that with certain pathological sets of data it could be enormously less efficient than
even the brute force method of considering all n! possible rank orderings of the objects in
turn. New Slater does not suffer from this disadvantage (it computes at most
$n + n(n-1) + n(n-1)(n-2) + \dots + n(n-1)(n-2)\cdots 4\cdot 3 = (e-2)\,n! - 1$, to the nearest integer,
lower bounds, and this is clearly in general a ridiculous overestimate), although with certain
highly pathological matrices it may be somewhat less efficient than the algorithm of
Remage & Thompson, which requires

$$n\,2^{n-1} - n^2 \qquad (3)$$

computations. For example, with eight 9 x 9 tables for which i = 12, having between 45
and 243 nearest adjoining orders, New Slater computed between 2705 and 2937 lower
bounds, as against the Remage & Thompson algorithm's 2223 'computations'. The
latter, however, appears to be excluded from practical consideration by the fact that, for
the determination of nearest adjoining orders, it requires simultaneous storage of the
results of all its computations: for n > 15, this would exceed the entire fast storage capacity
of Atlas, unless special integer packing procedures, which would increase the time taken,
were employed.
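
The two operation counts quoted above can be checked numerically. The following is a minimal sketch, not part of the ALGOL 60 procedure; the function names are hypothetical, and the Remage & Thompson count is read as n·2^(n−1) − n², an assumption that reproduces the figure of 2223 quoted for 9 x 9 tables.

```python
import math


def new_slater_bound_count(n: int) -> int:
    """Worst-case number of lower bounds: n + n(n-1) + ... + n(n-1)...4*3."""
    total = 0
    term = 1
    for factor in range(n, 2, -1):  # multiply successively by n, n-1, ..., 3
        term *= factor
        total += term
    return total


def closed_form_bound_count(n: int) -> int:
    """(e - 2) n! - 1, rounded to the nearest integer."""
    return round((math.e - 2) * math.factorial(n) - 1)


def remage_thompson_count(n: int) -> int:
    """Assumed reading of expression (3): n * 2**(n - 1) - n**2."""
    return n * 2 ** (n - 1) - n * n


if __name__ == "__main__":
    for n in range(3, 10):
        print(n, new_slater_bound_count(n), closed_form_bound_count(n),
              remage_thompson_count(n))
```

The comparison makes the point of the text concrete: the factorial bound grows far faster than the exponential count, even though the bound is rarely approached in practice.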

Exceptional cases of one kind or another apart, New Slater appears to be in general 
more efficient than Slater. Two sets of tests have been run to verify this. The first
consisted of timed runs of all distinct (i.e. not derivable from one another by row and
column permutations or reflection about the minor diagonal) tables for n = 3, 4, 5 and 6
on an Elliott 803 computer: to make the test fairly severe for New Slater, all patterns were
arranged so that there were as many −'s in the top right-hand half of each table as possible.
The results indicated that as n increased, New Slater, whether computing i alone, or i and
all nearest adjoining orders (it made little difference which), was substantially faster than
Slater computing i alone, and very considerably faster than Slater computing i and all
nearest adjoining orders. The second set of tests consisted of timed runs of sixty-eight
16x16 tables, obtained from thought-disordered schizophrenics, on an I.C.T. Atlas. 
New Slater took somewhat under two hours to compute i alone, and somewhat longer for i 
and all nearest adjoining orders, whilst in the same time Slater computing i alone completed 
only the first twenty-five, and computing i and all nearest adjoining orders completed only 
the first six and most of the seventh. 

These results leave little doubt that for most practical purposes the present procedure
is the more efficient, although even it is very time-consuming. 

Information Theory of Choice-Reaction Times. By D. R. J. LAMING. London and
New York: Academic Press. 1968. Pp. ix+172. 63s. 

It is helpful to use the notion of information to describe the quantifiable aspects of
variables on which both decision and response-making are based. For this purpose the 
definition of information usually involves the specification of hypotheses, each of which 
assigns a probability measure to an event. The information contained in the event for 
discrimination in favour of one hypothesis against another is then defined as the (average) 
logarithm of the ratio of the probabilities assigned by the two hypotheses. In this way 
modellers of a given stimulus-response situation can produce differing definitions of informa- 
tion by selecting differing events and/or hypotheses and thus emphasize differing aspects of 
the situation. Many readers, on encountering 'information' in the context of choice
reactions, might expect a discussion based on the Shannon measure of information. But 
the information theory that is proposed, analysed and tested in this monograph selects as 
its basic event the magnitude of a brief sensory impression generated by the stimulus, and 
the hypotheses that the subject tests refer to whether or not a stimulus from a known set has 
been presented. Though the author adds that this sensory event need not have a psycho- 
logical interpretation and can be regarded as a merely hypothetical construct which facilitates 
the presentation of the theory, this superficial interpretation would seem to rob the theory of 
much of its appeal. The author links information and choice reaction time by assuming that 
reaction time is determined by the time taken to collect a quantity of information, the 
quantity being determined by the error rates which a subject is prepared to tolerate. Hence 
this information theory of choice reaction times focuses principally on error rates, and its 
most attractive feature is that it does indeed account for the rates obtained in the experiments
reported.

The fundamental assumption made by the author is that the subject obtains his sensory
data by extracting information continuously from the stimulus display, and in such a way
that successive increments of information are mutually independent. A continuous rather
than a discrete process (analogous to a random walk) is adopted as the basis of the model in
order to avoid two problems. The first concerns the length and interpretation of the time
taken to make each observation in the discrete process, and the second is the 'excess over
the boundary' problem. This arises because, when the collected information first exceeds
a prescribed quantity, it will do so by a finite amount, and this means that the reaction-time 
distributions derived from Wald's Identity (a technique borrowed from sequential sampling 
theory) are only approximate. However, the author shows that if the probability distribution 
of each observation is infinitely divisible, then a series of independent random variables can 
be constructed which is an arbitrarily close approximation to a continuous process, so that 
the use of Wald's Identity in this case would yield exact results.

Other assumptions specify that at the start of the continuous process the information in
favour of a stimulus with prior probability p is log [p/(1 − p)]; that the information stream
contains all that is relevant to the discrimination and nothing that is irrelevant; and that a
response is made as soon as the information collected reaches the appropriate amount. Also
stated are necessary and sufficient conditions on these amounts so that no two of them can be
attained simultaneously by the stream of information. Finally, an optimization axiom
states that in suitable circumstances subjects try to achieve the fastest total performance
compatible with a given average error rate.

The theory leads to three important predictions which are tested using hitherto
unpublished data from seven experiments. The first specifies the relation between error
scores and prior signal probabilities, and the experimental data conform well to this prediction.
The second, also supported by the data, is that in a two-choice experiment, the signal
which elicits the faster average reaction will have the smaller probability of error, and
conversely. The third is that, for a given response, the distribution of reaction time is
the same whether the response is correct or an error. This prediction, however, is not
supported by the data from any of the experiments. These show in fact that in the
two-choice case, errors are always faster than the same response made correctly, while in
multi-choice cases errors frequently take longer than correct responses. However, it is shown that
if the model is modified to take account of the subject’s uncertainty as to when to start 
sampling from the stimulus display, then errors would be faster than correct responses in the 
two-choice case. 

A surprising omission from the list of tested predictions is that for the optimal decision
process, average reaction time is linearly related to a function, f(p, ε), of the signal (p) and
error (ε) probabilities. A comparison between the values of this function and average
reaction time in experiment 2 is made below, and the agreement appears to be good.

Signal prob. p      0.750   0.625   0.500   0.375   0.250
Error prob. ε       0.023   0.026   0.027   0.029   0.023
f(p, ε)             1.315   1.436   1.473   1.385   1.315
Average RT (sec)    0.384   0.399   0.413   0.409   0.395
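
One way to put a number on the agreement in the table is to correlate the tabulated function values with the average reaction times and fit a straight line. The sketch below is only illustrative: the five pairs are read from the table above (the fourth function value is taken as 1.385), and the helper names are hypothetical. With only five points the result is a rough guide, not a test of the linearity prediction.

```python
from math import sqrt

# Values read from the comparison table for experiment 2.
f_values = [1.315, 1.436, 1.473, 1.385, 1.315]
rt_values = [0.384, 0.399, 0.413, 0.409, 0.395]


def pearson_r(xs, ys):
    """Product-moment correlation between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def least_squares_line(xs, ys):
    """Slope and intercept of the least-squares line of ys on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


if __name__ == "__main__":
    r = pearson_r(f_values, rt_values)
    slope, intercept = least_squares_line(f_values, rt_values)
    print(f"r = {r:.3f}, RT = {intercept:.3f} + {slope:.3f} f")
```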


The chapters stating and testing the information stream model are flanked by a very 
useful critique of the notion of information, a survey of other choice models and by an 
inventory of those features of the author’s data which contradict the other models. The 
last but one chapter is descriptive and presents a comprehensive analysis of the sequential
effects present in the data. It also contains an attractive treatment of the trial by trial
fluctuation of level of attention, in which attention is regarded as continuously variable, 
its fluctuations being expressed by a first-order autoregressive scheme. The monograph ends 
with three appendices, two of them proving lemmas and theorems about a diffusion-decision 
process, and the third spelling out the methods used to analyse the experimental data. 

Since some of the mathematics in the appendices is 'exploratory with a view to
future experimental work', it may be useful to examine its relevance to psychologists.
Some of it concerns the continuity of a random walk when the distribution of successive
increments is stationary and infinitely divisible. It will be remembered that these
questions arose because the author wanted to evade the problems encountered with the
discrete random walk. But it may well be that the subject does collect information
in discrete steps and, if so, questions of continuity would not seem to be the ones of
interest to psychologists. What would be more interesting is whether it is possible to derive
differing behavioural predictions when continuous rather than discrete sampling is assumed
in the model. Secondly, as mentioned above, the simple model predicts that mean error
times will equal mean correct response times, a prediction which is not supported by the
data. The argument used to modify this prediction by introducing temporal uncertainty
assumes that the number of steps in a random walk contributed by irrelevant information
($n_0$) is small compared with the number of steps required to reach a boundary (n). It is
then shown that the difference between error and correct response times is proportional
to $n_0$. However, the difference that is observed between these times is sometimes of the
same order as a response time, i.e. empirically $n_0$ seems to be of the same order as n, which
violates the first assumption of the argument. This suggests that a major modification of the
model is required if it is to handle this aspect of the data, and until this is done many of the
results of the appendix cannot be expected to influence future experimental work.
ч pete јен ад унф "P ee an empirical result which should be borne in mind is did 
tothe greatest for pene: 224 average error rate depends on the signal probability, p, E 
is independent of the а а ретро желе ая setting st the pon 
sould Ge caked m = eters of the random walk, so that an additional assu 

ALTERNATIVE MODELS FOR INFORMATION PROCESSING:
CONSTRUCTING NON-PARAMETRIC TESTS


By EWART A. C. THOMAS


University College London


In recognition tasks where the stimuli vary along two or more dimensions,
the question arises as to what strategy the subject uses in processing the information
about the various stimulus attributes. Three strategies are discussed:
parallel, serial and integrative processing. It is shown that each of these three
strategies implies a relation among the distribution functions of three observable
random variables. The problem is then to decide on the basis of three samples,
one from each distribution, which one of the relations is most likely. Non-parametric
statistics are defined which can be used to construct test functions for
this decision problem, and a suggestion is made concerning the design of experiments
which attempt to discriminate between these strategies.


1. INTRODUCTION


The issue of whether a subject in an experiment can process information
about a presented stimulus serially or in parallel has been raised in many areas
of psychological research (see, e.g., Neisser, 1967; Nickerson, 1967; Posner,
1969). The interpretation of serial and parallel processing seems to depend on
the experimental situation, and in this note we shall confine our attention to a 
relatively simple recognition task and discuss the relevance to this issue of some 
theoretical results reported by Thomas (1969), hereafter referred to as (T). 

Let us consider three experimental conditions in which a subject is asked to
recognize stimuli which vary along two dimensions a and b; $a_1, a_2, \dots$ and
$b_1, b_2, \dots$ denoting the possible stimulus values. The subject is asked to press a
'yes' key only if the presented stimulus falls under a previously defined category,
and to press a 'no' key otherwise. The three categories or sets of positive
stimuli, each set defining an experimental condition, are, say, $C_1$, stimuli having
the value $a_1$; $C_2$, stimuli having the value $b_1$; and $C_3$, stimuli having the values
$a_1$ and $b_1$. It is supposed that in an experimental condition stimuli having both
positive and negative values are presented in a random sequence, but here we will
be concerned only with the response times to those stimuli which require a 'no'
response in all three conditions, so that the experiment will furnish three sets
$\{x_{1i}\}$, $\{x_{2j}\}$ and $\{x_{3k}\}$ of response times of sizes $n_1$, $n_2$ and $n_3$, respectively, where,
for fixed h (= 1, 2 or 3), $x_{h1}, x_{h2}, \dots, x_{hn_h}$ are the obtained No-response times
from condition ($C_h$). It is further assumed that they are independent observations
of a random variable $X_h$ with distribution function (d.f.) $F_h(x)$.

In condition ($C_3$), the subject can respond 'No' as soon as he checks either
that the stimulus does not have the value $a_1$, or that it does not have the value $b_1$.
Alternative assumptions can be made about the checking procedure in this
condition.

Parallel Processing

The first is that the subject processes the stimulus information along both
dimensions simultaneously and independently, and responds 'No' as soon as
one positive dimension value is checked as absent. In this case $X_3$ would be
the smaller of $X_1$ and $X_2$, which implies

$$H_0: \quad R_3(x) = R_1(x)\,R_2(x),$$

where $R_h = 1 - F_h$.

Serial Processing

The second assumption is that the subject processes the stimulus information
along one dimension only and responds 'No' as soon as the value on that
dimension is determined. Let p be the probability that the information is
processed along the a dimension, and q = 1 − p the probability that it is processed
along the b dimension. Then, with probability p, $X_3$ comes from the population
with d.f. $F_1(x)$, and with probability q from the population with d.f. $F_2(x)$. In
other words, this assumption implies

$$H_1: \quad F_3(x) = pF_1(x) + qF_2(x).$$


Integrative Processing

The third assumption is that the stimulus information on the two dimensions
is processed simultaneously and independently, but both processes 'output'
into a single integrator which triggers a response when it contains a prescribed
amount of information.

This last strategy is formulated more precisely in Section 3.1. In the first
instance we shall consider possible statistical tests of $H_0$ and $H_1$, and in the
appendix we shall discuss briefly the case where the stimuli vary on three
dimensions.


2. DEFINITION AND PROPERTIES OF TEST STATISTICS

Let $X_1$, $X_2$ and $X_3$ be the previously defined positive random variables
with d.f.'s $F_1$, $F_2$ and $F_3$, respectively, and, when the observed values $x_h$ are
ranked in ascending order, let (h) (h = 1, 2, 3) be the rank of $x_h$. We shall suppose,
for convenience, that $X_1$ is stochastically less than or equal to $X_2$. In the
special case where $P(X_1 < X_2) = 1$, if $H_0$ is true then $X_3$ will have the d.f. $F_1$ and
$P(X_3 < X_1) = 1/2 = P(X_3 < X_1 \text{ and } X_2)$, whereas if $H_1$ is true it can be verified
that $P(X_1 < X_3 < X_2) = 1/2 = P[((1), (2), (3))$ is an odd permutation of $(1, 2, 3)]$.
This suggests that, in general, it would be useful to define two functions of the
ranks (h) as follows:

$$s[(1), (2), (3)] = \begin{cases} 0 & \text{if } (3) = 1, \text{ i.e. if } x_3 \text{ is the smallest,} \\ 1 & \text{otherwise;} \end{cases}$$

$$t[(1), (2), (3)] = \begin{cases} 0 & \text{if } ((1), (2), (3)) \text{ is an even permutation of } (1, 2, 3), \\ 1 & \text{otherwise.} \end{cases}$$


s and t are clearly not independent; in fact,

$$(1 - s)\,t = \begin{cases} 1 & \text{if } ((1), (2), (3)) = (3, 2, 1), \\ 0 & \text{otherwise.} \end{cases}$$
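
The two rank functions can be tabulated over all six orderings of a triple. The sketch below is illustrative only; the function names are hypothetical, and the parity of a permutation is obtained by counting inversions, which is one standard way of deciding whether it is even or odd.

```python
from itertools import permutations


def ranks(x1, x2, x3):
    """Return ((1), (2), (3)): the ranks of three distinct values, 1 = smallest."""
    values = (x1, x2, x3)
    order = sorted(range(3), key=lambda h: values[h])
    rank = [0, 0, 0]
    for position, h in enumerate(order, start=1):
        rank[h] = position
    return tuple(rank)


def s_indicator(rank):
    """s = 0 if x3 is the smallest of the three, 1 otherwise."""
    return 0 if rank[2] == 1 else 1


def t_indicator(rank):
    """t = 0 for an even permutation of (1, 2, 3), 1 for an odd one."""
    inversions = sum(
        1 for a in range(3) for b in range(a + 1, 3) if rank[a] > rank[b]
    )
    return inversions % 2


# Table of (s, t) for every possible rank vector.
st_table = {r: (s_indicator(r), t_indicator(r)) for r in permutations((1, 2, 3))}

if __name__ == "__main__":
    for rank_vector, (s, t) in sorted(st_table.items()):
        print(rank_vector, "s =", s, "t =", t)
```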


It has been shown in (T) that if $H_1$ is true, $E(t) = 1/2$. If $H_0$ is true then

$$P(s = 0) = P(X_3 < X_1, X_2) = \int_0^\infty R_1(x) R_2(x)\, dF_3(x) = \int_0^1 (1 - F_3)\, dF_3 = \frac{1}{2}. \qquad (2.1)$$

This suggests defining U-statistics by looking at all possible triples
$(x_{1i}, x_{2j}, x_{3k})$. Let

$$s^* = \frac{1}{n_1 n_2 n_3} \sum_{i,j,k} s[(1i), (2j), (3k)]$$

and

$$t^* = \frac{1}{n_1 n_2 n_3} \sum_{i,j,k} t[(1i), (2j), (3k)].$$

In (T) it has been shown that if the $n_h$'s are all equal to n, under $H_1$, $n^{1/2}(t^* - 1/2)$
is asymptotically distributed as $N(0, \sigma^2)$, where $\sigma^2 < 1/4$, provided $F_1(x) \not\equiv F_2(x)$.
Similarly, it can be shown that under $H_0$, $n^{1/2}(s^* - 1/2)$ is asymptotically distributed
as $N(0, \sigma^2)$, where $\sigma^2 < 1/4$, provided $F_1$ and $F_2$ are not degenerate d.f.'s having
all their probability at the same x-value.
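Expressions for $s^*$ and $t^*$ are available which simplify their computation.
Let

$q_k(1)$ be the number of $x_{1i}$'s < $x_{3k}$ (k = 1, 2, ..., $n_3$),
$q_k(2)$ be the number of $x_{2j}$'s < $x_{3k}$ (k = 1, 2, ..., $n_3$),
$r_j(1)$ be the number of $x_{1i}$'s < $x_{2j}$ (j = 1, 2, ..., $n_2$),
$r_j(3)$ be the number of $x_{3k}$'s < $x_{2j}$ (j = 1, 2, ..., $n_2$).

Then

$$s^* = \frac{1}{n_1 n_3} \sum_{k} q_k(1) + \frac{1}{n_2 n_3} \sum_{k} q_k(2) - \frac{1}{n_1 n_2 n_3} \sum_{k} q_k(1)\, q_k(2)$$

and

$$t^* = \frac{1}{n_1 n_3} \sum_{k} q_k(1) - \frac{1}{n_1 n_2} \sum_{j} r_j(1) + \frac{1}{n_2 n_3} \sum_{j} r_j(3).$$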
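
The U-statistics can be computed either by enumerating every triple or, more cheaply, from pairwise counts. The sketch below is an illustration only: the function names are hypothetical, ties are assumed absent, and the counting identities it uses (s = 0 exactly when $x_3$ lies below both $x_1$ and $x_2$; t equal to $[x_1 < x_3] - [x_1 < x_2] + [x_3 < x_2]$ for distinct values) are worked out here from the definitions of s and t rather than quoted from (T).

```python
from bisect import bisect_left
from itertools import product


def s_star_t_star_brute(x1, x2, x3):
    """Average s and t over all n1*n2*n3 triples (x_1i, x_2j, x_3k)."""
    n_triples = len(x1) * len(x2) * len(x3)
    s_sum = 0
    t_sum = 0
    for a, b, c in product(x1, x2, x3):
        s_sum += 0 if (c < a and c < b) else 1
        # Parity indicator written as a signed sum of pairwise comparisons.
        t_sum += (a < c) - (a < b) + (c < b)
    return s_sum / n_triples, t_sum / n_triples


def s_star_t_star_counts(x1, x2, x3):
    """Same statistics from pairwise counts, avoiding the triple loop."""
    n1, n2, n3 = len(x1), len(x2), len(x3)
    sorted1, sorted2, sorted3 = sorted(x1), sorted(x2), sorted(x3)
    # bisect_left gives the number of sample values strictly below v.
    q1 = [bisect_left(sorted1, v) for v in x3]  # x_1i's below each x_3k
    q2 = [bisect_left(sorted2, v) for v in x3]  # x_2j's below each x_3k
    r1 = [bisect_left(sorted1, v) for v in x2]  # x_1i's below each x_2j
    r3 = [bisect_left(sorted3, v) for v in x2]  # x_3k's below each x_2j
    smallest = sum((n1 - a) * (n2 - b) for a, b in zip(q1, q2))
    s_star = 1 - smallest / (n1 * n2 * n3)
    t_star = sum(q1) / (n1 * n3) - sum(r1) / (n1 * n2) + sum(r3) / (n2 * n3)
    return s_star, t_star


if __name__ == "__main__":
    # Made-up response times, for illustration only.
    x1 = [0.31, 0.42, 0.55]
    x2 = [0.40, 0.61, 0.70]
    x3 = [0.35, 0.50, 0.66]
    print(s_star_t_star_brute(x1, x2, x3))
    print(s_star_t_star_counts(x1, x2, x3))
```

With sorted samples the counting version needs on the order of n log n operations rather than n³, which matters once each condition contributes more than a few dozen readings.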

3. TEST PROCEDURE

It is clear that $H_0$ and $H_1$ are mutually exclusive, for if they were both true
we would have

$$pR_1 + qR_2 = R_3 = R_1 R_2.$$

Therefore

$$R_1(p - R_2) = -qR_2.$$

The right-hand side of this equation is always negative but, since $R_2(x)$ is
non-increasing, there exists a value x' such that $R_2(x) < p$ for x > x'. For these
values of x the left-hand side of the equation would be positive, which is contradictory.
Let us therefore consider the simplified decision problem of testing
the hypothesis $H_0$ against the alternative $H_1$. (With respect to a recognition
task, neither of these hypotheses may be true and the 'realistic' decision model
is more complex.)

If $H_0$ is true, then from eqn. (2.1), $E(s^*) = 1/2$. If $H_0$ is false, then $H_1$ is true
and

$$1 - E(s^*) = P(X_3 < X_1, X_2) = \int_0^\infty R_1(x) R_2(x)\, dF_3(x) = \frac{1}{2} - p \int_0^\infty F_2 R_1\, dF_1 - q \int_0^\infty F_1 R_2\, dF_2 < \frac{1}{2}.$$

Therefore

$$E(s^*) > 1/2.$$

Since, if $H_0$ is true, the variance of $s^*$ tends to 0 as $n \to \infty$, the test procedure,

Reject $H_0$ only if $s^* > 1/2 + \delta(\alpha)$,

would be consistent and asymptotically unbiased for a given confidence level $\alpha$.
In the special case where $P(X_1 < X_2) = 1$, if $H_0$ is false, $E(s^*) = 1 - \tfrac{1}{2}p$, so that the
power of the test would be a decreasing function of p.

The use of $t^*$ as a test statistic is limited. In the particular case where
$f_1(x) = x e^{-x}$ and $f_2(x) = (1/4) e^{-x/4}$, $E(t^*) = 0.455$ if $H_0$ is true. However, it is
shown in (T) that, in general, $E(t^*)$ is not necessarily different from 1/2 under
$H_0$. Therefore, a test based on $t^*$ for testing the hypothesis $H_0$ against the
alternative $H_1$ would, in some cases, have no power. For this reason, the above
test based on $s^*$ would be better than one based on $t^*$.
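
The quoted value of $E(t^*)$ under parallel processing can be checked by numerical integration. The sketch below makes two assumptions that are not stated in this form in the text: the densities are read as $f_1(x) = x e^{-x}$ and $f_2(x) = (1/4) e^{-x/4}$, and $E(t)$ is written as $P(X_1 < X_3) - P(X_1 < X_2) + P(X_3 < X_2)$, which holds for distinct values. The Simpson routine and all names are illustrative.

```python
import math


def simpson(func, lower, upper, intervals=20000):
    """Composite Simpson rule; `intervals` must be even."""
    step = (upper - lower) / intervals
    total = func(lower) + func(upper)
    for i in range(1, intervals):
        total += (4 if i % 2 else 2) * func(lower + i * step)
    return total * step / 3


def f1(x):
    return x * math.exp(-x)          # density of X1


def surv1(x):
    return (1 + x) * math.exp(-x)    # R1 = 1 - F1


def f2(x):
    return 0.25 * math.exp(-x / 4)   # density of X2


def surv2(x):
    return math.exp(-x / 4)          # R2 = 1 - F2


def surv3(x):
    return surv1(x) * surv2(x)       # R3 = R1 R2 under parallel processing


UPPER = 80.0  # integrands are negligible beyond this point

p_x1_lt_x3 = simpson(lambda x: f1(x) * surv3(x), 0.0, UPPER)
p_x1_lt_x2 = simpson(lambda x: f1(x) * surv2(x), 0.0, UPPER)
p_x3_lt_x2 = 1.0 - simpson(lambda x: f2(x) * surv3(x), 0.0, UPPER)

expected_t_parallel = p_x1_lt_x3 - p_x1_lt_x2 + p_x3_lt_x2

if __name__ == "__main__":
    print(round(expected_t_parallel, 3))
```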

If $s^* > 1/2 + \delta(\alpha)$, it does seem desirable to check that $t^* = 1/2$ before accepting
$H_1$. Such a check of consistency is desirable because of the oversimplification
of the decision problem, even though it is not independent of the first test
inasmuch as $s^*$ and $t^*$ are dependent random variables. The question still
remains as to the interpretation of $s^*$ values less than 1/2. Dr T. Shallice
(personal communication) has suggested that such values could result if
the processing in condition $C_3$ were 'hyperparallel' in the sense that the
information about the two dimension values is integrated in some sense. In
the next subsection we will state one model for integrative processing and will
show that for this model $E(s^*) < 1/2$ and, for a particular choice of $F_1$ and $F_2$,
$E(t^*) < 1/2$. Given these three possible models for the processing in condition
$C_3$ we would then have the following crude inference scheme based on the
observed $s^*$ and $t^*$ values: $s^* = 1/2$, $0 < t^* < 1$, infer parallel processing; $s^* > 1/2$,
$t^* = 1/2$, infer serial processing; $s^* < 1/2$, $t^* < 1/2$, infer some form of integrative
processing. At present no interpretation can be offered of other combinations
of $s^*$ and $t^*$ values, e.g. $s^* > 1/2$, $t^* < 1/2$, except that they perhaps indicate that the
mode of processing is a mixture of the three modes so far proposed (see Nickerson,
1967).


FIGURE 1. Partition of the unit square giving the acceptance regions for the three proposed
models for the processing in condition $C_3$, assuming that $P(X_1 < X_2) > 1/2$: (a) parallel,
(b) serial and (c) integrative. The possibility that the processing is a mixture of
these three is ignored here.


In testing whether the $s^*$ or $t^*$ value obtained from a single subject's data
is significantly different from 1/2, it can reasonably be assumed that the statistic
is normally distributed if n, the minimum number of readings in a condition, is
large. The 'variance' 1/(4n) can be used to set conservative tolerance limits for
$E(s^*)$ or $E(t^*)$. If data from a group of subjects are available, it would be appropriate
to use the Student t-test on the $s^*$ (or $t^*$) values to decide whether the
group mean differs significantly from 1/2.
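
For a single subject, the decision rule of this section can be sketched as follows. This is a hedged illustration rather than the author's procedure: the margin is taken to be a one-sided normal quantile multiplied by the conservative standard deviation 1/(2√n), which is one reading of the variance bound given above, and the function names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def conservative_margin(alpha: float, n: int) -> float:
    """Assumed form of the margin: z_(1 - alpha) * sqrt(1 / (4 n))."""
    z = NormalDist().inv_cdf(1.0 - alpha)
    return z / (2.0 * sqrt(n))


def reject_parallel(s_star: float, n: int, alpha: float = 0.05) -> bool:
    """Reject parallel processing only if s* exceeds 1/2 by the margin."""
    return s_star > 0.5 + conservative_margin(alpha, n)


if __name__ == "__main__":
    n = 16  # minimum number of readings in a condition (illustrative)
    print(conservative_margin(0.05, n))
    print(reject_parallel(0.75, n), reject_parallel(0.60, n))
```

Because the variance bound is an upper limit, the rule errs on the side of retaining parallel processing; with small n the margin is wide and only large values of $s^*$ lead to rejection.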


3.1. An Integration Model

Let us suppose that, on a given trial in each of the three experimental
conditions, the rates at which information is processed are $r_1$, $r_2$ and $r_3$, respectively.
$r_h$ (h = 1, 2, 3) is regarded as a positive random variable. Let us
suppose that a response is made as soon as the amount of information processed
is equal to unity. Then we can write

$$r_h = X_h^{-1}.$$

The integration model states that

$$r_3 = r_1 + r_2, \qquad (3.1.1)$$

from which a relation involving d.f.'s is obtained,

$$H_2: \quad F_3'(x) = \int_0^x F_1'(x - y)\, dF_2'(y), \qquad (3.1.2)$$

where $F_h'$ is the d.f. of $X_h^{-1}$. $H_0$ and $H_1$ can also be written in terms of the
$F_h'$:

$$H_0: \quad F_3'(x) = F_1'(x)\, F_2'(x)$$

and

$$H_1: \quad F_3'(x) = pF_1'(x) + qF_2'(x).$$

It may be noted that one can define a statistic, analogous to $s^*$ and $t^*$, which
is based on the result,

$$P(X_3^{-1} > X_1^{-1} + X_2^{-1}) = 1/2$$

if $H_2$ is true.


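If $x, y \ge 0$, $F_1'(x - y) \le F_1'(x)$ since $F_1'$ is a non-decreasing function. Using
eqn. (3.1.2), we have

$$F_3'(x) = \int_0^x F_1'(x - y)\, dF_2'(y) \le F_1'(x) \int_0^x dF_2'(y) = F_1'(x)\, F_2'(x).$$

Therefore, under $H_2$,

$$1 - E(s^*) = P(X_3 < X_1, X_2) = P(X_3^{-1} > X_1^{-1}, X_2^{-1}) = \int_0^\infty F_1' F_2'\, dF_3' \ge \int_0^\infty F_3'\, dF_3' = 1/2.$$

Therefore

$$E(s^*) \le 1/2.$$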

We now restrict our attention to the case where $r_2$ is drawn from a population
with probability density function (p.d.f.) $f_2'(x) = \beta e^{-\beta x}$, and $r_1$ is the sum of two
r.v.'s drawn from the same population as $r_2$. Therefore the p.d.f. of $r_1$ is

$$f_1'(x) = \beta^2 x\, e^{-\beta x},$$

and, under $H_2$, that of $r_3$ is

$$f_3'(x) = \tfrac{1}{2} \beta^3 x^2 e^{-\beta x}.$$

With these distributions we have

$$E(s^*) = 1 - \int_0^\infty F_1' F_2' f_3'\, dx = 0.363,$$

and

$$E(t^*) = \int_0^\infty F_3' f_1'\, dx - \int_0^\infty F_2' f_1'\, dx + \int_0^\infty F_2' f_3'\, dx = 0.438.$$

This proves the statements made in the previous section. It is a conjecture
that, for arbitrary $F_1$ and $F_2$, if $P(X_1 < X_2) > 1/2$ and $H_2$ is true, then $E(t^*) < 1/2$,
but I am unable to prove it.

1 These p.d.f.'s of $r_1$ and $r_2$ correspond to p.d.f.'s of $X_1$ and $X_2$ given by
$f_1(x) = (\beta^2/x^3) \exp(-\beta/x)$ and $f_2(x) = (\beta/x^2) \exp(-\beta/x)$,
both of which are positively skewed and have 'high' tails which decrease as a power of x.
They are, therefore, not unreasonable models for reaction time distributions.
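
The two expectations quoted for the integration model can be checked by numerical integration over the rate distributions. The sketch below is illustrative: it sets β = 1 (the results do not depend on the scale), takes the rate densities to be exponential for $r_2$, gamma with shape 2 for $r_1$ and gamma with shape 3 for $r_3$, and uses a simple Simpson rule; all names are hypothetical.

```python
import math


def simpson(func, lower, upper, intervals=20000):
    """Composite Simpson rule; `intervals` must be even."""
    step = (upper - lower) / intervals
    total = func(lower) + func(upper)
    for i in range(1, intervals):
        total += (4 if i % 2 else 2) * func(lower + i * step)
    return total * step / 3


BETA = 1.0  # scale parameter; the expectations are scale-free


def pdf_r1(x):
    return BETA ** 2 * x * math.exp(-BETA * x)


def cdf_r1(x):
    return 1.0 - (1.0 + BETA * x) * math.exp(-BETA * x)


def cdf_r2(x):
    return 1.0 - math.exp(-BETA * x)


def pdf_r3(x):
    return 0.5 * BETA ** 3 * x * x * math.exp(-BETA * x)


def cdf_r3(x):
    bx = BETA * x
    return 1.0 - (1.0 + bx + 0.5 * bx * bx) * math.exp(-bx)


UPPER = 80.0  # integrands are negligible beyond this point

# X3 is smallest exactly when the rate r3 exceeds both r1 and r2.
expected_s = 1.0 - simpson(lambda x: cdf_r1(x) * cdf_r2(x) * pdf_r3(x), 0.0, UPPER)

# E(t) = P(r1 > r3) - P(r1 > r2) + P(r3 > r2), written with rates r = 1/X.
expected_t = (
    simpson(lambda x: cdf_r3(x) * pdf_r1(x), 0.0, UPPER)
    - simpson(lambda x: cdf_r2(x) * pdf_r1(x), 0.0, UPPER)
    + simpson(lambda x: cdf_r2(x) * pdf_r3(x), 0.0, UPPER)
)

if __name__ == "__main__":
    print(round(expected_s, 3), round(expected_t, 3))
```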


4. AN EXPERIMENT 


As an illustration of the above test procedure, consider some of the unpublished
results of an experiment. Six subjects were used and the stimulus
dimension values were:

$a_1$ = circle, $a_2$ = triangle, $a_3$ = cross,
$b_1$ = orange, $b_2$ = red and $b_3$ = yellow.

In all three conditions the probability of presenting a stimulus with a positive
value was 2/3; $n_1 = n_2 = n_3 = 16$. An analysis of variance showed that the times
from conditions (1) and (2) came from different populations, that is, that $F_1 \ne F_2$,
subject to the conditions assumed in an analysis of variance, e.g. equality of
variance.

The $s^*$ and $t^*$ values obtained from the six subjects were:

Subjects   1       2       3       4       5       6       Mean
s*         0.643   0.529   0.700   0.624   0.502   0.565   0.594
t*         0.478   0.466   0.535   0.514   0.559   0.499   0.509

The Student t value for the $s^*$'s is 3.1, which is significant at the 5 per cent level,
and that for the $t^*$'s is 0.3, which is not significant. The inference to be drawn
from these calculations is that subjects appear to process serially in this particular
task, but the point of the exercise has been to indicate that the test procedure
outlined in Section 3 can sometimes be useful.
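
The group test can be retraced from the tabulated values with a one-sample Student t statistic against the null value 1/2. The sketch below is illustrative, and the function name is hypothetical. The figures it produces are recomputed from rounded table entries and need not reproduce the printed t values exactly; the two-sided 5 per cent critical value for 5 degrees of freedom is 2.571.

```python
from math import sqrt
from statistics import mean, stdev

# s* and t* values for the six subjects, as tabulated above.
s_values = [0.643, 0.529, 0.700, 0.624, 0.502, 0.565]
t_values = [0.478, 0.466, 0.535, 0.514, 0.559, 0.499]

CRITICAL_T_5DF = 2.571  # two-sided 5 per cent point of Student's t, 5 d.f.


def one_sample_t(values, null_mean=0.5):
    """Student t statistic for the hypothesis that the mean equals null_mean."""
    n = len(values)
    standard_error = stdev(values) / sqrt(n)
    return (mean(values) - null_mean) / standard_error


if __name__ == "__main__":
    for label, values in (("s*", s_values), ("t*", t_values)):
        t_stat = one_sample_t(values)
        verdict = "significant" if abs(t_stat) > CRITICAL_T_5DF else "not significant"
        print(label, round(mean(values), 3), round(t_stat, 2), verdict)
```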


5. CONCLUDING REMARKS 


This paper has been concerned with the information processes leading to the 
recognition of a stimulus, and in discussing statistical tests of the nature of the 
processes it has been assumed that recognition time can be equated to observed 
reaction time. However, it is more plausible that an observed response is 
comprised of a number of stages, of which stimulus recognition is but one, such 
that distinct stages are executed in distinct intervals of time (Sternberg, 1969). 
If this is the case, the random variables to which the hypotheses $H_0$, $H_1$ and $H_2$
refer would be unobservable, and the suggested test procedure would be inapplicable
unless further assumptions are made. If recognition time differs
from observed reaction time by a constant amount, the functions s and t defined
in Section 2 remain unchanged whether their arguments are recognition times
or the corresponding observed reaction times. Therefore, under this assumption
of constant difference, the statistical tests based on reaction time (Section 3)
would have exact size $\alpha$. It is suggested that these tests would have size approximately
$\alpha$ if the variance of the difference between observed reaction time and
recognition time is small relative to the variance of the recognition time. This
would probably be the case if the ratio of average recognition time to average
reaction time is close to unity, and this requirement should perhaps influence the
choice of stimuli to be used in the experiment.

1 This experiment was carried out at University College by Miss E. Saraga.

The hypotheses $H_0$, $H_1$ and $H_2$ refer to the no-response times from conditions
$C_1$, $C_2$ and $C_3$. In general, it is felt that comparisons should be made
only among response times that are conditional on the same response. With
this restriction, it is clear that the statement of these hypotheses is the same if,
instead, one were to consider the yes-response times to conditions $C_1$, $C_2$ and a
condition in which the positive category is defined disjunctively: $a_1$ or $b_1$.
Also one can state similar hypotheses relating the distribution functions of
yes-response times in $C_1$, $C_2$ and $C_3$. Finally, it may be noted that if processing
times along the two dimensions have the same distribution, i.e. if $F_1 = F_2$,
then, under $H_1$, $F_3 = F_1$ so that this hypothesis is indistinguishable from the
hypothesis that every no-response time comes from the same population having
distribution function $F_1$, independently of the stimulus attributes. For this
reason it is desirable that an experimenter, in designing the task, should choose
$a_1$ and $b_1$ so as to ensure that the processing times along the two dimensions are
not similar.

APPENDIX

When the stimuli vary on three dimensions, let us consider the four
experimental conditions each defined by one of the
categories, ($C_1$) $a_1$; ($C_2$) $b_1$; ($C_3$) $c_1$; ($C_4$) $a_1$ and $b_1$ and $c_1$. Let $X_h$ be a no-response
time from ($C_h$) with d.f. $F_h(x)$. Then we can write

$$H_0: \quad R_4(x) = R_1(x)\, R_2(x)\, R_3(x)$$

and

$$H_1: \quad F_4(x) = p_1 F_1(x) + p_2 F_2(x) + (1 - p_1 - p_2) F_3(x).$$

The rank (h) of $X_h$ is defined as in Section 2, and

$$s[(1), \dots, (4)] = \begin{cases} 0 & \text{if } (4) = 1, \\ 1 & \text{otherwise,} \end{cases}$$

$$t[(1), \dots, (4)] = \begin{cases} 0 & \text{if } ((1), \dots, (4)) \text{ is an even permutation of } (1, \dots, 4), \\ 1 & \text{otherwise.} \end{cases}$$

Clearly under $H_0$ $E(s) = 1/2$, and it is shown in (T) that $E(t) = 1/2$ under $H_1$. $s^*$ and
$t^*$ can be defined as before; here we give a formula only for $t^*$ since $H_1$ seems to
be of more interest to psychologists than $H_0$.

Given two samples $(x_{a1}, \dots, x_{an_a})$ and $(x_{b1}, \dots, x_{bn_b})$ where a, b = 1, ..., 4,
let $r_m(a, b)$ be the number of $x_b$'s which are greater than $x_{am}$. Let

$$\alpha_{ab} = \frac{1}{n_a n_b} \sum_{m=1}^{n_a} r_m(a, b).$$

Then

$$t^* = (\alpha_{12} - 2\alpha_{12}\alpha_{34} + \alpha_{34}) - (\alpha_{13} - 2\alpha_{13}\alpha_{24} + \alpha_{24}) + (\alpha_{14} - 2\alpha_{14}\alpha_{23} + \alpha_{23}).$$

THE EFFECT OF NUMBER OF RESPONSE ALTERNATIVES ON
RESPONSE FREQUENCY AND LATENCY IN PAIRED-ASSOCIATE
LEARNING


By RALPH F. HALL


Department of Psychology, University of Sydney, New South Wales, Australia


An experiment was carried out to investigate the effect of the number of
response alternatives on response latency in paired-associate learning. In
previous studies no systematic differences have been found between error and
success latencies of trials prior to the last error, but these studies have involved
only two or three response alternatives. In the present study eight stimuli
were paired with either two-, four- or eight-response alternatives.

The mean error latencies were an increasing function of the number of
response alternatives, while on pre-criterion trials, mean error latencies were
consistently greater than mean success latencies with four- and eight-response
alternatives. No systematic differences were found in the two-response condition.

A four-state all-or-none memory model was fitted to both the frequency and
latency data, and adequately accounted for the four- and eight-response conditions,
but yielded some discrepant predictions in the two-response condition. These
discrepancies were attributed to the use of different recall strategies by the
subjects.


1. INTRODUCTION 

Since Bower's (1961) analysis of paired-associate (PA) learning in terms 
of a two-state Markov model, a variety of three- and four-state models have been 
developed to account for departures from the predictions of the two-state model 
found in a number of experiments (e.g. Atkinson & Crothers, 1964; Bernbach, 
1965; Greeno, 1967; Restle, 1964; Suppes & Ginsberg, 1963). Experiments 
carried out for the purpose of testing these models have almost invariably involved 
fewer response alternatives than stimuli. This procedure involves the pairing of 
each response with more than one stimulus item and so differs from the usual 
procedure in which there is a 1-1 correspondence between stimuli and responses. 
This change in the experimental paradigm has passed virtually without comment. 
The purpose of using a small number of response alternatives has apparently 
been to eliminate such factors as response learning in order to study the associa- 
tion process in isolation (Theios, 1968). However, the use of highly familiar 
verbal material such as numerals as responses, with the subject being informed
of the response alternatives prior to the experiment, should ensure elimination
of response learning.

The aim of the present research is to examine the effect of the size of the set
of response alternatives on both response probability and response latency in a
PA learning task. The theoretical development is based on the four-state
memory model proposed by Atkinson & Crothers (1964). This model is based
on a distinction between short-term and long-term memory, and incorporates
a forgetting process. The four states of the model are U, F, S and L, where U
is an uncoded state, S is a temporary or short-term memory state, F is a state in
which the connection between the encoded stimulus and the correct response has
been forgotten, and L is a permanent or long-term memory state.

Since the present research is concerned with response latency as well as
response probability, one disadvantage of the Atkinson & Crothers model is
that it has been developed only for response probability. However, by the
addition of a further assumption, Suppes et al. (1966) and Theios (1965) have
shown how Markov models for response probabilities can be easily extended to
response latencies. The additional assumption needed is that for each state of
the model there exists a probability distribution of response latencies, each of
which has a finite mean. No specific assumptions have been made about the form
of the distribution. Suppes et al. incorporated a latency mechanism of this kind 
in a three-state Markov model and compared predictions of the model with data 
obtained from two PA learning experiments, in which 12 nonsense syllables 
were paired with three response alternatives (key presses). While predictions 
concerning response frequencies were in close agreement with the data, marked 
discrepancies between predicted and obtained values of the mean latency over 
trials occurred in one of the two experiments. Suppes et al. attributed this and
other discrepancies between observed and predicted latency statistics to a sharp
increase in mean latency on the trial of the last error. A graph of the backward
latency curve showed no systematic differences between success and error 
latencies on trials prior to the last error, but a considerable increase in mean 
latency on the trial of the last error followed by a steady decrease on following 
trials. In other studies in which response latency has been measured (Keller 
et al., 1967; Kintsch, 1965; Millward, 1964) no systematic differences between 
error and success latencies on trials prior to the last error have been found. 
Kintsch and Millward also found no evidence for substantial increases in
response latency on the trial of the last error. One procedural difference
between these two experiments and the experiments by Suppes et al. was that,
in the former, two response alternatives were used, whereas three responses
were used in the latter. Suppes et al. also present data from an experiment
conducted by Estes & Horst, in which 24 common four-letter words were paired
with numbers, each stimulus having a unique response. In this experiment a
marked increase in mean latency on the trial of the last error occurred and in
addition an increase in mean latency from two trials prior to the last error was
evident. These results suggest that the number of response alternatives might
be a relevant factor, not only in relation to mean latency on the trial of the last 
error, but also in relation to latencies on trials prior to the last error. 


2. THEORY

While the three-state model provided a satisfactory fit to the response 
frequency data from the two experiments reported by Suppes et al., Atkinson & 
Crothers (1964) obtained a generally poor fit of the model in the eight experi- 
ments reported by them. However, in three of these eight experiments three 
response alternatives were used and the remaining five involved four responses.
The overall poor fit of the three-state model was due primarily to the experiments
involving four response alternatives. Since in the present experiment the 
number of response alternatives is varied, the four-state model would seem to 
be the more adequate. It also provides a framework in which the response 
latency effects found by Suppes et al. can be interpreted. 


2.1. An Extension of the Four-State Model to Response Latencies
The transition matrix and response probability vector of the four-state
memory model developed by Atkinson & Crothers (1964) are as follows:


           C_3      C_2             C_1           C_0      Pr(Correct)
   C_3      1        0               0             0            1
   C_2      a     (1-a)(1-f)      (1-a)f           0            1
   C_1      a     (1-a)(1-f)      (1-a)f           0            g
   C_0      ca    c(1-a)(1-f)     c(1-a)f         1-c           g


where the states U, F, S and L have been relabelled $C_0$, $C_1$, $C_2$ and $C_3$
respectively for convenience in labelling latency statistics. The probability of a
correct response, $g$, to items in states $C_0$ and $C_1$ is assumed to be equal to
$1/r$ where there are $r$ response alternatives. The parameters of the model are
a, f and c, where c is the probability that an item is encoded; a is the probability
that an encoded item is stored in long-term memory; and f is the probability that
an item stored in short-term memory is forgotten by the next presentation of that
item. This model has been labelled the long-short (LS) model by Atkinson &
Crothers. As well as the three-parameter case (LS3) they also consider the special
two-parameter case (LS2), where c = 1, i.e. encoding occurs on the first trial.

To extend the model to response latencies let $t_n$ denote the latency on trial n.
Then, following Suppes et al. (1966), it is assumed that for each state, $C_i$, there
exists a probability distribution of latencies with mean $\mu_i$ ($i = 0, 1, 2, 3$).

Let $u_{in} = \Pr(C_{i,n})$, i.e. $u_{in}$ is the probability of an item being in
state $C_i$ on trial n. Expressions for $u_{in}$ are given by Atkinson & Crothers
(1964). They are (assuming $c \neq a$)

\[
u_{0n} = (1 - c)^{n-1},
\]

\[
u_{1n} =
\begin{cases}
\dfrac{cf(1-a)}{c-a}\left[(1-a)^{n-1} - (1-c)^{n-1}\right] & \text{for } n > 1, \\[1ex]
0 & \text{for } n = 1,
\end{cases}
\]

\[
u_{2n} = \frac{1-f}{f}\, u_{1n},
\]

\[
u_{3n} = 1 - u_{0n} - u_{1n} - u_{2n}.
\]

The mean latency on trial n is given by

\[
E(t_n) = u_{0n}\mu_0 + u_{1n}\mu_1 + u_{2n}\mu_2 + u_{3n}\mu_3. \tag{1}
\]

Let $X_n$ be a random variable such that $X_n = 1$ if an error occurs on trial n
and $X_n = 0$ if a correct response occurs on trial n. The expressions for the mean
latency on trial n conditional on an error on trial n, $E(t_n \mid X_n = 1)$, and
conditional on a correct response on trial n, $E(t_n \mid X_n = 0)$, are then

\[
E(t_n \mid X_n = 1) = \frac{u_{0n}\mu_0 + u_{1n}\mu_1}{u_{0n} + u_{1n}} \tag{2}
\]

and

\[
E(t_n \mid X_n = 0) =
\frac{u_{0n}\,g\,\mu_0 + u_{1n}\,g\,\mu_1 + u_{2n}\mu_2 + u_{3n}\mu_3}
     {u_{0n}\,g + u_{1n}\,g + u_{2n} + u_{3n}}. \tag{3}
\]
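
As a rough numerical illustration of eqns. (1) to (3), the sketch below steps the state probabilities forward one trial at a time from the transition matrix and compares them with the closed-form expressions. The function names are hypothetical and are not taken from the original program. The parameter values are the four-response estimates reported later in Tables 2 and 4, and it is assumed, as the text implies, that every item starts in the uncoded state on trial 1.

```python
def next_state_probs(u, a, f, c):
    """Advance [u0, u1, u2, u3] (states C0..C3) by one trial."""
    u0, u1, u2, u3 = u
    # Items that are encoded at this transition: everything already in C1 or C2,
    # plus the fraction c of the uncoded items in C0.
    encoded = u1 + u2 + c * u0
    return [
        (1 - c) * u0,                  # C0 (U): still uncoded
        (1 - a) * f * encoded,         # C1 (F): encoded, not learned, forgotten
        (1 - a) * (1 - f) * encoded,   # C2 (S): encoded, held in short-term memory
        u3 + a * encoded,              # C3 (L): absorbing long-term state
    ]


def state_probs(n, a, f, c):
    """State probabilities on trial n, starting with all items in C0 on trial 1."""
    u = [1.0, 0.0, 0.0, 0.0]
    for _ in range(n - 1):
        u = next_state_probs(u, a, f, c)
    return u


def closed_form(n, a, f, c):
    """Closed-form state probabilities (requires c != a)."""
    u0 = (1 - c) ** (n - 1)
    if n == 1:
        u1 = 0.0
    else:
        u1 = c * f * (1 - a) / (c - a) * ((1 - a) ** (n - 1) - (1 - c) ** (n - 1))
    u2 = (1 - f) / f * u1
    return [u0, u1, u2, 1 - u0 - u1 - u2]


def mean_latency(u, mu):
    """Eqn. (1): unconditional mean latency."""
    return sum(ui * mi for ui, mi in zip(u, mu))


def error_latency(u, mu):
    """Eqn. (2): errors can only arise in C0 and C1."""
    return (u[0] * mu[0] + u[1] * mu[1]) / (u[0] + u[1])


def success_latency(u, mu, g):
    """Eqn. (3): correct responses are guesses in C0 and C1, certain in C2 and C3."""
    weights = [g * u[0], g * u[1], u[2], u[3]]
    return sum(w * m for w, m in zip(weights, mu)) / sum(weights)


# Four-response estimates (LS3 row of Table 2 and Table 4), g = 1/4.
A, F, C, G = 0.215, 0.328, 0.357, 0.25
MU = [1.145, 2.133, 1.486, 1.124]

for n in range(1, 8):
    u = state_probs(n, A, F, C)
    print(n, round(mean_latency(u, MU), 2),
          round(error_latency(u, MU), 2),
          round(success_latency(u, MU, G), 2))
```

With these inputs the error latency for trial 2 comes out at about 1.27 sec., which agrees with the predicted value tabled for the four-response condition in Table 5.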


2.2. A Memory Search Interpretation 


Although in the formulation of the model detailed assumptions have been 
made about the structure of memory, little has been said about the nature of the 
retrieval process. In this section retrieval from short-term memory is viewed 
as a memory search process and some implications of the proposed search 
mechanism regarding relationships among the latency parameters are drawn. 

The idea of a memory search or scanning mechanism has been suggested by 
Eimas & Zeaman (1963), Millward (1964), Suppes et al. (1966) and Yntema & 
Trask (1963). In postulating a storage mechanism as an interpretation of the 
three-state model, Suppes et al. assumed that a subject has a list of stimuli and 
the correct response to each stored in his memory, and that while the list is 
incomplete he proceeds to search the list to find the correct response to the item 
presented. Eimas & Zeaman (1963) make the assumption that the subject scans 
the response alternatives until he finds a match with traces of the correct response 
stored in memory. They argued that the latency of a correct response should be 
shorter than that of an incorrect response, since in the latter case the subject 
scans all the response alternatives without finding a match. This scanning 
mechanism differs from the one proposed by Suppes et al. in that the scanning 
operation is assumed to be carried out over the list of response alternatives rather 
than over the list of stimuli and associated responses.

To relate a search mechanism of this kind to the four-state model outlined
in the preceding section, it is assumed that the scanning operation is only initiated
when the item is stored in short-term memory (state S) or has been forgotten
(state F). It is also assumed that items not yet encoded (state U) are guessed,
and that items stored in long-term memory (state L) can be retrieved directly
without the operation of a search process. This assumption postulates a different
mode of operation of the retrieval process for short-term and for long-term
memory and is similar to that suggested by Theios (1965).


Response latency can be related to the search mechanism by the assumption
that latency is a linear function of scan time per item. If the subject scans the
response alternatives according to a sampling without replacement scheme, as
Eimas & Zeaman imply, then the mean of the latency distribution for the
forgetting state will be greater than the mean of the latency distribution for the
short-term memory state, i.e. $\mu_1 > \mu_2$. This is so, since the correct response to
an item stored in short-term memory can be retrieved, so that, on the average,
fewer than the total number of responses need to be scanned. The correct
response to an item which has been forgotten cannot be retrieved, so that all
responses are scanned, but as a match is not found the subject guesses randomly
from among the response alternatives. The means of the latency distributions in
both the initial and long-term memory states should be less than $\mu_2$, since the
scanning mechanism is assumed not to operate in these states.

Some implications of the search mechanism for the effect of the size of the 
set of response alternatives are apparent. As the number of response alternatives,
r, increases, $\mu_1$ and $\mu_2$ will increase, as there are more responses to
scan. If the scanning operation is carried out over the stimuli and associated
responses as suggested by Suppes et al. (1966), then such differences would not 
be predicted. 
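
The directional argument can be made concrete with a small count of scanned alternatives. The sketch below is only an illustration under added assumptions that are not stated in the text: the scan is taken to stop as soon as a match is found, and the matching alternative is taken to be equally likely to occupy any position in the scan order. The function names are hypothetical.

```python
from fractions import Fraction


def expected_scans_stored(r):
    """Item in short-term memory: the scan stops at the match, which is assumed
    to be equally likely to sit at any of the r positions."""
    return Fraction(sum(range(1, r + 1)), r)   # equals (r + 1) / 2


def expected_scans_forgotten(r):
    """Forgotten item: no match exists, so all r alternatives are scanned."""
    return Fraction(r)


for r in (2, 4, 8):
    print(r, expected_scans_stored(r), expected_scans_forgotten(r))
```

Under these assumptions both counts grow with r, and the forgotten-item count exceeds the stored-item count for every r greater than 1, which is the ordering the text predicts for the two latency means.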

In the preceding discussion only the direction of the differences among the 
means of the latency distributions in each state has been predicted. This 
alternative has been adopted in preference to attempting a precise formulation 
at this stage. Confirmation of the directional predictions would warrant a 
consideration of more detailed assumptions about the nature of the search 
process. 


3. EXPERIMENT 


Subjects. The subjects were 30 first-year undergraduate students in psychology at the
University of Sydney, who participated in the experiment in order to complete part of
the course requirement.

Apparatus. The subjects were seated in a 4 x 4 x 10 ft. sound-reduced room facing
two adjacent digital display screens, 4.35 in. high and 3.35 in. wide, visible through a small
window in the room. The screens were located about 1 ft. above eye level at a distance of
2 ft. from the seated subject. Stimulus and response items were presented as illuminations
on the screens and the order of presentation of pairs was determined by the experimenter
manually from outside the room. The sequence of events and time intervals were
controlled by electronic timers. The subjects gave verbal responses and to enable recording
of response latencies a microphone was situated to the right of the seated subject.
Activation of a voice key through the microphone stopped a chronoscope, turned off the stimulus
item, and turned on the correct response item. A warning light to indicate the onset of a
stimulus item was situated in the room to the right of the display screens.

Procedure. The anticipation method of PA learning was used. Each stimulus-response
presentation consisted of illumination of the warning light for 2 sec., followed immediately
by presentation of the stimulus item, which appeared in the left-hand screen and remained
visible for a maximum of 3 sec. As soon as the subject responded, the stimulus item was
turned off and the correct response item appeared in the right-hand screen and remained
visible for 1 sec. The interval between the onset of the warning light from one presentation
to the next was fixed at 12 sec.

The subjects were read the following instructions:


'You are to learn to say numbers in response to letters.

Each time a letter appears in the left-hand screen I want you to say
one of the numbers . . . [experimenter informed subject which numbers were to be used as
responses] as quickly as you can. It is important that you use only these numbers as
responses and that you give a response every time a letter appears, even if it is only a guess.
For each letter which appears, one of these numbers will be the correct response. You
will be able to learn which number goes with each letter because as soon as you give a
response the correct number will appear in the right-hand screen.

So that you will be ready to respond, the warning light will come on and the letter will
appear as soon as the light goes off.

Remember to respond as quickly as you can, even after you have learned, as the
experiment will continue for some time after you have learned the correct responses.'

Eight letters were used as stimuli for all subjects and the experiment continued until
20 presentations of each of the eight stimuli had been given.

Design. The main variable of interest was the number of response alternatives. The
eight letters were paired with either two, four or eight numbers as responses. For the
two-response condition the responses were the numbers 6 and 7; for the four-response
condition they were the numbers 5, 6, 7 and 8; and for the eight-response condition they
were the numbers 2 to 9.

In order to control for possible individual differences in response latencies a repeated-
measurement design was used so that each subject was tested with each number of response
alternatives. For this purpose three lists of letters were made up from the letters of the
alphabet excluding O and I. No set contained any letters in common and the letters in
each list were spread throughout the alphabet. The three lists were as follows: List A:
B, W, Q, G, Z, F, J, N. List B: D, U, S, Y, K, P, N, A. List C: V, L, X, R, C, E, H, T.

The subjects were assigned randomly to three groups, each group being given a different
lists-response alternatives combination in accordance with a Greco-Latin square (Winer,
1962, p. 546). There were ten subjects in each group. This design enables latency
comparisons among the two-, four- and eight-response conditions to be made for the same
subjects.

The order of presentation of stimuli was randomized within each block of eight items,
with each item being presented once in each block. This was done separately for each
condition of the design. Responses were assigned randomly to the stimuli but with the
restriction that for the early letters of the alphabet the number corresponding to the
position of the letter (e.g. B2, C3) was not paired with the letter. This was done to avoid
obvious associations and to keep the lists as uniform in difficulty as possible.
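
The randomization described above can be sketched as follows. This is only an illustration and not the procedure actually used to prepare the lists: the function names, the seed and the use of rejection sampling are hypothetical, and it is assumed that each response is paired with the same number of stimuli (the Discussion confirms this only for the two-response condition).

```python
import random


def presentation_order(stimuli, n_blocks, rng):
    """Shuffle the stimuli afresh within each block, so that every item is
    presented exactly once per block."""
    blocks = []
    for _ in range(n_blocks):
        block = list(stimuli)
        rng.shuffle(block)
        blocks.append(block)
    return blocks


def alphabet_position(letter):
    """Position of a letter in the alphabet (A = 1, B = 2, ...)."""
    return ord(letter.upper()) - ord("A") + 1


def assign_responses(stimuli, responses, rng):
    """Pair responses with stimuli at random, using each response equally often
    (an assumption) and rejecting any pairing such as B2 or C3."""
    per_response = len(stimuli) // len(responses)
    pool = [r for r in responses for _ in range(per_response)]
    while True:
        rng.shuffle(pool)
        if all(alphabet_position(s) != r for s, r in zip(stimuli, pool)):
            return dict(zip(stimuli, pool))


LIST_C = ["V", "L", "X", "R", "C", "E", "H", "T"]
rng = random.Random(1969)   # arbitrary seed, for reproducibility only

blocks = presentation_order(LIST_C, n_blocks=20, rng=rng)
pairs_eight = assign_responses(LIST_C, list(range(2, 10)), rng)
pairs_two = assign_responses(LIST_C, [6, 7], rng)
print(blocks[0])
print(pairs_eight)
print(pairs_two)
```

Applying the restriction to every letter, rather than only to the early ones, has the same effect here, because only letters in positions 2 to 9 can coincide with one of the response numbers.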


4. RESULTS 
4.1. Frequency Data 


For the purpose of analysis of data, a learning criterion of three errorless
runs through the list was used. Occasional subsequent errors were not treated
as such, provided no more than one subsequent error was made by a subject on a
given item. Subjects who had not reached this criterion were eliminated and
replaced. Only four subjects were eliminated from the analysis by this restriction.
Most of the subjects made no errors after trial 10.

An analysis of variance was performed on number of trials to criterion. It
may be noted that this measure of performance should be largely unaffected by
the different guessing probabilities, unlike measures such as number of errors.
The results of the analysis are reported in Table 1. The only effect significant at
the 0.01 level is the number of response alternatives. The mean trials to
criterion for the two-, four- and eight-response conditions were respectively
9.03, 12.17 and 11.27, indicating that learning occurred more rapidly in the two-
response condition than in the four- or eight-response conditions. Since no
other effects were significant the data were combined for each number of response
alternatives. Thus the following latency comparisons among these conditions
are for the same subjects.


TABLE 1. ANALYSIS OF VARIANCE OF TRIALS TO CRITERION


Source                     D.F.      S.S.      M.S.       F
Between subjects:
    Groups                   2       22.9      11.5      0.74
    Error between           27      418.7      15.5
Within subjects:
    Sessions                 2       32.1      16.1      2.78
    Lists                    2        6.1       3.1      0.53
    No. of responses         2      122.6      61.3     10.57*
    Error within            54      315.3       5.8
Total                       89      917.7

* Significant at 0.01 level.
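
The arithmetic of Table 1 can be checked from the sums of squares and degrees of freedom alone. The sketch below is illustrative only. It assumes that each mean square was rounded to one decimal place before the F ratio was formed, an assumption that is not stated in the text but that reproduces the tabled ratios; the dictionary layout and names are hypothetical.

```python
from decimal import Decimal, ROUND_HALF_UP


def rounded(value, places):
    """Round half up to the precision given by `places`, e.g. "0.1"."""
    return value.quantize(Decimal(places), rounding=ROUND_HALF_UP)


# (degrees of freedom, sum of squares) as read from Table 1.
ERROR_TERMS = {
    "Error between": (27, "418.7"),
    "Error within": (54, "315.3"),
}
EFFECTS = {
    "Groups": (2, "22.9", "Error between"),
    "Sessions": (2, "32.1", "Error within"),
    "Lists": (2, "6.1", "Error within"),
    "No. of responses": (2, "122.6", "Error within"),
}


def mean_square(df, ss):
    return rounded(Decimal(ss) / df, "0.1")


f_ratios = {}
for name, (df, ss, error_name) in EFFECTS.items():
    error_df, error_ss = ERROR_TERMS[error_name]
    f_ratios[name] = rounded(mean_square(df, ss) / mean_square(error_df, error_ss), "0.01")

total_ss = sum(Decimal(ss) for _, ss in ERROR_TERMS.values()) + sum(
    Decimal(ss) for _, ss, _ in EFFECTS.values()
)
print(f_ratios)
print(total_ss)
```

The between-subjects effect is tested against the between-subjects error and the three within-subjects effects against the within-subjects error, as the layout of the table indicates.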


TABLE 2. PARAMETER ESTIMATES AND MINIMUM $\chi^2$ VALUES FOR THREE- AND
FOUR-STATE MEMORY MODELS


                                   Experimental condition
Model     Parameter        Two           Four          Eight
                         responses     responses     responses
LS3       a                0.262         0.215         0.240
          f                0.748         0.328         0.326
          c                1.000         0.357         0.432
          $\chi^2$        40.43         19.24         29.85
LS2       a                0.262         0.179         0.210
          f                0.748         0.765         0.717
          $\chi^2$        40.43         59.16*        92.46*

* Significant at 0.01 level.


To obtain estimates of the parameters a, f and c of the four-state model the
method of minimum $\chi^2$ was used. The procedure followed was that described
by Calfee & Atkinson (1965). This involved finding the parameter values which
minimize the sum of the $\chi^2$ values computed from the observed and predicted
four-tuple response sequences (i.e. sequences of correct and incorrect responses)
on trials 2-5 and on trials 6-9. The predicted probabilities of these sequences
are given by Atkinson & Crothers (1964). A CDC3200 computer was
programmed to carry out the minimization procedure. The estimates of the
parameters obtained and the $\chi^2$ values for both the two- and three-parameter
versions of the model are shown in Table 2. The degrees of freedom for $\chi^2$ are
27 for the three-parameter version (LS3) and 28 for the two-parameter version
(LS2).
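
A minimal sketch of a minimum $\chi^2$ fit of this kind is given below. It is not the CDC3200 program, whose search method is not described. The predicted four-tuple probabilities are obtained here by a forward recursion over the transition matrix rather than from the closed-form expressions of Atkinson & Crothers; the minimization is a coarse grid search; and the "observed" frequencies are synthetic, generated from hypothetical parameter values so that the search has a known answer. All names and numerical settings are assumptions made for illustration.

```python
from itertools import product


def transition_matrix(a, f, c):
    """Rows and columns ordered C0 (U), C1 (F), C2 (S), C3 (L)."""
    from_encoded = [0.0, (1 - a) * f, (1 - a) * (1 - f), a]
    return [
        [1 - c, c * (1 - a) * f, c * (1 - a) * (1 - f), c * a],
        from_encoded,
        from_encoded,
        [0.0, 0.0, 0.0, 1.0],
    ]


def advance(u, P):
    return [sum(u[i] * P[i][j] for i in range(4)) for j in range(4)]


def sequence_probs(first_trial, a, f, c, g, length=4):
    """Probability of every error/correct pattern (1 = error, 0 = correct) on
    `length` successive trials beginning at `first_trial`."""
    P = transition_matrix(a, f, c)
    p_correct = [g, g, 1.0, 1.0]
    u = [1.0, 0.0, 0.0, 0.0]            # all items uncoded on trial 1
    for _ in range(first_trial - 1):
        u = advance(u, P)
    probs = {}
    for seq in product((0, 1), repeat=length):
        alpha = list(u)
        for k, x in enumerate(seq):
            if k > 0:
                alpha = advance(alpha, P)
            alpha = [w * ((1 - pc) if x else pc) for w, pc in zip(alpha, p_correct)]
        probs[seq] = sum(alpha)
    return probs


def chi_square(observed, params, g, n_items):
    """Sum of chi-square values over the blocks of trials in `observed`."""
    a, f, c = params
    total = 0.0
    for first_trial, counts in observed.items():
        for seq, p in sequence_probs(first_trial, a, f, c, g).items():
            expected = n_items * p
            total += (counts[seq] - expected) ** 2 / expected
    return total


# Synthetic illustration: hypothetical "true" values and item count.
G = 0.25
N_ITEMS = 240
TRUE = (0.2, 0.3, 0.4)                   # (a, f, c)
observed = {
    t: {s: N_ITEMS * p for s, p in sequence_probs(t, *TRUE, G).items()}
    for t in (2, 6)                      # blocks starting at trials 2 and 6
}

grid = [i / 10 for i in range(1, 10)]    # interior grid, so no cell has expectation 0
best = min(product(grid, repeat=3), key=lambda p: chi_square(observed, p, G, N_ITEMS))
print(best)
```

Each block of four trials yields 16 possible sequences, hence 15 free frequencies per block; two blocks less three estimated parameters gives the 27 degrees of freedom quoted for LS3.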

The $\chi^2$ values obtained indicate an adequate fit of the three-parameter
model to the data from all three conditions. The fit of the two-parameter
model, however, is satisfactory only for the two-response condition. The
minimum $\chi^2$ value for the two-response condition is the same for the two-
parameter and three-parameter model since the estimate of c is 1. In this case
the model reduces to a three-state model after the first trial since all items are
then in states F, S or L. Atkinson & Crothers (1964) and Greeno & Scandura
(1966) have shown that this model predicts that the proportion of correct
responses on trials after trial 1 and before the last error is constant. The two-
state model (Bower, 1961) predicts that this proportion is constant over all trials
prior to the last error. Suppes & Ginsberg (1963) have given a test of this
hypothesis for the two-state model which can be modified for the LS2 model by
excluding trial 1 from the analysis. Since the two-state model has not been
considered in the present analysis both tests have been carried out, together with
the test of independence of successive responses prior to the last error. The
results of these tests are reported in Table 3. All values are significant at the
0.01 level except the stationarity test excluding trial 1 and the independence test
in the two-response condition. This supports the applicability of the LS2
model to this condition.


TABLE 3. STATIONARITY AND INDEPENDENCE TESTS


                                              $\chi^2$
                          D.F.       Two           Four          Eight
                                   responses     responses     responses
Stationarity
    Including trial 1       6       28.30*       [illegible]   [illegible]
    Excluding trial 1       5     [illegible]    [illegible]   [illegible]
Independence                1        3.94          9.07*        23.10*

* Significant at 0.01 level.


The three-parameter LS3 model does not predict a constant probability of a
correct response on trials prior to the last error, but an upper bound for this
probability is given by $1 - f(1 - g)$. By substituting the estimates of f from
Table 2 into this expression for the four- and eight-response conditions, the
obtained values of the upper bounds are 0.754 and 0.715 respectively. In neither
condition did the maximum value of the proportion of correct responses on trials
prior to the last error exceed the upper bound.

Values of the parameters a, f and c for the four- and eight-response
conditions are about the same, so that differences between these two conditions
should be due primarily to the different guessing probabilities. The learning
curves for the three conditions are shown in Fig. 1. The predicted probability
of a correct response on trial n, $\Pr(c_n)$, is obtained from the equation

\[
\Pr(c_n) = g(u_{0n} + u_{1n}) + u_{2n} + u_{3n}.
\]

The predicted values are also shown in Fig. 1. In each condition the obtained
values on trial 1 exceed the theoretical guessing probability. In general, the
agreement between observed and predicted values is quite satisfactory, although
there is a slight tendency towards underestimating the obtained proportions on
trials 5-8. The obtained curve for the eight-response condition crosses the
curve for the four-response condition at trial 5, whereas the theoretical curves
cross at trial 4. The crossing of the theoretical curves is due to the estimate of c
being slightly larger for the eight-response group.
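
The upper bound and the learning-curve equation are simple enough to evaluate directly. The sketch below is illustrative only: the function names are hypothetical, the parameter values are the LS3 estimates from Table 2, and the state probabilities are stepped forward from the transition matrix on the assumption that all items are uncoded on trial 1.

```python
def upper_bound(f, g):
    """Upper bound on Pr(correct) before the last error under LS3: 1 - f(1 - g)."""
    return 1 - f * (1 - g)


def learning_curve(a, f, c, g, n_trials):
    """Pr(c_n) = g(u0n + u1n) + u2n + u3n for n = 1..n_trials."""
    u0, u1, u2, u3 = 1.0, 0.0, 0.0, 0.0    # every item starts in the uncoded state
    curve = []
    for _ in range(n_trials):
        curve.append(g * (u0 + u1) + u2 + u3)
        encoded = u1 + u2 + c * u0          # items encoded at this transition
        u0, u1, u2, u3 = (
            (1 - c) * u0,
            (1 - a) * f * encoded,
            (1 - a) * (1 - f) * encoded,
            u3 + a * encoded,
        )
    return curve


# LS3 estimates from Table 2, with g = 1/r.
four = dict(a=0.215, f=0.328, c=0.357, g=1 / 4)
eight = dict(a=0.240, f=0.326, c=0.432, g=1 / 8)

print(round(upper_bound(four["f"], four["g"]), 3))     # 0.754, as quoted in the text
print(round(upper_bound(eight["f"], eight["g"]), 3))   # 0.715, as quoted in the text

for label, params in (("four", four), ("eight", eight)):
    print(label, [round(p, 3) for p in learning_curve(n_trials=20, **params)])
```

On trial 1 the predicted proportion equals the guessing probability g, since no item has yet been encoded.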


FIGURE 1. Observed and predicted proportion of correct responses over trials for
two-response, four-response and eight-response alternatives. Ordinate: proportion
of correct responses; abscissa: trials.


Expressions for the distribution of the trial number of the last error and 
mean total errors per item have also been derived by Atkinson & Crothers (1964). 
Goodness of fit of the predicted distribution of the trial number of the last error 
was tested by means of the Kolmogorov-Smirnov one-sample test. In no case 
did the maximum difference exceed the tabulated value of d at the 0.01 level.
It can thus be concluded from the results presented that the model provides an
adequate account of the response frequency data.

4.2. Latency Data


The mean latencies of successes and errors over trials for the two-, four-
and eight-response conditions are shown in Fig. 2. Error latencies on trials
beyond those shown were omitted as being based on too few observations. In
all three conditions the success latencies show an initial increase followed by a
steady decline to what appears to be an asymptote at about trial 18. The three
curves are almost identical in shape and reach about the same peak, but level off
at different asymptotes, the two-response curve at about 1.02 sec., the four-
response curve at about 1.14 sec., and the eight-response curve at about 1.06 sec.
The mean error latencies, however, show marked differences. In the two-
response condition the success and error latencies show only a slight tendency
to separate, whereas in the four- and eight-response conditions the separation is
marked. Furthermore, the error latencies in the eight-response condition rise
more steeply than those of the four-response condition. For example, on trial 5
the mean error latencies for the two-, four- and eight-response conditions are
respectively 1.39, 1.68 and 1.88 sec.


FIGURE 2. Mean latencies of successes and errors over trials for two-response,
four-response and eight-response alternatives. Solid lines, successes; broken lines,
errors. Ordinate: mean latency in sec.; abscissa: trials.

To investigate these differences further, backward latency curves were
drawn in the way described by Suppes et al. (1966) and Millward (1964). All
item sequences with the last error occurring on the same trial were grouped
together and the subgroups aligned, so that the trial of the last error coincides
for all subgroups. Mean latencies were then computed across the subgroups
and the resulting curves for the two-, four- and eight-response conditions are
shown in Fig. 3. In these curves 0 stands for the last error regardless of the
trial on which it occurred, -1 stands for one trial before the last error, and so on.
Sequences containing no errors or on which the last error occurred on trial 1
are excluded from the graph. Errors on trial 1 are simply incorrect guesses,
as the items had not previously been presented.


FIGURE 3. Backward latency curves for two-response, four-response and
eight-response alternatives. Solid lines, successes; broken lines, errors. Ordinate:
mean latency in sec.; abscissa: distance from last error.


Trends similar to those obtained in Fig. 2 can be seen. There appear to
be no systematic differences between mean latencies of successes and errors on
pre-criterion trials in the two-response condition. In the four- and eight-
response conditions, however, the error and success latency curves show a
tendency to separate, again with the greater separation occurring in the eight-
response condition. The drop in mean latency from the trial of the last error
to the following trial increases as the number of response alternatives increases,
the differences being 0.03, 0.21 and 0.29 sec. for the two-, four- and eight-
response conditions, respectively.

To estimate the parameters $\mu_i$ of the latency model the method of least
squares was used, since eqns. (1), (2) and (3) are linear in the $\mu_i$. The values of the
parameters a, f and c used in these equations were the minimum $\chi^2$ estimates
given in Table 2. The least-squares procedure involved specifying eqn. (1)
for each of the 20 trials and eqns. (2) and (3) for trials on which there were a
sufficient number of errors (at least 10 per cent of the total number of responses)
to base a mean, and solving for the least squares estimates of the four parameters.
The values obtained are shown in Table 4.


TABLE 4. ESTIMATES OF THE LATENCY PARAMETERS


                         Experimental condition
Parameter          Two           Four          Eight
                 responses     responses     responses
$\mu_0$              -           1.145         1.209
$\mu_1$            1.267         2.133         2.329
$\mu_2$            1.525         1.486         1.507
$\mu_3$            1.083         1.124         1.093
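
Because eqns. (1) to (3) are linear in the four latency means, the estimation reduces to an ordinary least-squares problem. The sketch below shows one way this could be set up and is not the original program. The target latencies are synthetic, generated from the four-response values in Table 4, so that the solver can be checked against a known answer; the choice of trials 1 to 7 for eqns. (2) and (3) and all function names are assumptions.

```python
def state_probs_by_trial(n_trials, a, f, c):
    """List of [u0, u1, u2, u3] for trials 1..n_trials (all items start in C0)."""
    u = [1.0, 0.0, 0.0, 0.0]
    out = []
    for _ in range(n_trials):
        out.append(u)
        encoded = u[1] + u[2] + c * u[0]
        u = [(1 - c) * u[0], (1 - a) * f * encoded,
             (1 - a) * (1 - f) * encoded, u[3] + a * encoded]
    return out


def design_rows(us, g, error_trials):
    """Coefficient rows of the latency means in eqns. (1), (2) and (3)."""
    rows = [list(u) for u in us]                                 # eqn. (1), every trial
    for n in error_trials:
        u0, u1, u2, u3 = us[n - 1]
        e = u0 + u1
        rows.append([u0 / e, u1 / e, 0.0, 0.0])                  # eqn. (2)
        s = g * e + u2 + u3
        rows.append([g * u0 / s, g * u1 / s, u2 / s, u3 / s])    # eqn. (3)
    return rows


def least_squares(rows, y):
    """Solve the normal equations by Gaussian elimination with partial pivoting."""
    k = len(rows[0])
    m = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))] for i in range(k)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, k):
            factor = m[r][col] / m[col][col]
            for j in range(col, k + 1):
                m[r][j] -= factor * m[col][j]
    x = [0.0] * k
    for i in reversed(range(k)):
        x[i] = (m[i][k] - sum(m[i][j] * x[j] for j in range(i + 1, k))) / m[i][i]
    return x


# Four-response estimates from Table 2; Table 4 values serve as the known answer.
A, F, C, G = 0.215, 0.328, 0.357, 0.25
TRUE_MU = [1.145, 2.133, 1.486, 1.124]

us = state_probs_by_trial(20, A, F, C)
rows = design_rows(us, G, error_trials=range(1, 8))     # trials 1-7: an assumption
synthetic = [sum(w * mu for w, mu in zip(r, TRUE_MU)) for r in rows]
estimates = least_squares(rows, synthetic)
print([round(v, 3) for v in estimates])
```

One point the sketch makes visible is that the coefficient of $\mu_2$ in eqn. (1) is a constant multiple of the coefficient of $\mu_1$ on every trial, so eqn. (1) alone could not separate the two means; the conditional equations (2) and (3) supply the information that does.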

In the four- and eight-response conditions the order of magnitude of the
estimates of the means is the same, i.e. $\mu_1 > \mu_2 > \mu_0 > \mu_3$. This, however, is not
the case for the two-response condition. Since the model for this condition
reduced to a three-state model after trial 1, only three latency parameters were
estimated. The estimate of $\mu_0$ would, of course, be just the obtained mean
latency on trial 1. The order of magnitude of the estimates of $\mu_1$ and $\mu_2$ has
been reversed in the two-response condition. Consequently $\mu_1$ is the only
parameter which is a monotonic increasing function of the number of response
alternatives. If attention is confined just to the four- and eight-response
conditions, then $\mu_0$, $\mu_1$ and $\mu_2$ are each larger for the eight-response condition
than for the four-response condition.


FIGURE 4. Observed and predicted mean latency curves for two-response,
four-response and eight-response alternatives. Solid lines, data; broken lines,
theory. Ordinate: mean latency in sec.; abscissa: trials.


The estimates of the latency parameters in Table 4 were substituted in
eqn. (1) to predict the mean latency over trials. The obtained and predicted
curves for the two-, four- and eight-response conditions are shown in Fig. 4.
The theoretical curves for the four- and eight-response conditions fit the
observed values well. In each case the shape of the curve is accurately predicted.
For the two-response condition the fit is poor. The shape of the curve is not
adequately described, and the predicted values consistently underestimate the
obtained values on early trials and overestimate them on later trials.

The latency parameters were also substituted in eqns. (2) and (3) to predict 
the mean latencies of successes and errors as a function of trials. The obtained 
and predicted values are shown in Table 5. Again the fit of the model to the 
four- and eight-response conditions appears to be satisfactory, although there is a 
tendency toward underestimation of the mean error latencies on early trials in 
the four-response condition. Since there is only one state in which errors can
occur after trial 2 for the two-response condition, the mean error latency over
trials is predicted to be a constant. The obtained curve departs only slightly
from this. However, the mean success latencies lie above the predicted values 
for trials 3-6. 


TABLE 5. OBSERVED AND PREDICTED MEAN LATENCIES (IN SEC.) OF ERRORS
AND SUCCESSES


Trials                  1       2       3       4       5       6       7
Two responses
  Successes
    Observed          0.94    1.15    1.30    1.21    1.20    1.17
    Predicted           -     1.27    1.21    1.17    1.14    1.13
  Errors
    Observed          1.01    1.19    1.23    1.32    1.39    1.21
    Predicted           -     1.27    1.27    1.27    1.27    1.27
Four responses
  Successes
    Observed          1.09    1.21    1.35    1.30    1.36    1.27    1.29
    Predicted         1.15    1.24    1.38    1.33    1.30    1.27    1.25
  Errors
    Observed          1.10    1.33    1.50    1.57    1.68    1.64    1.56
    Predicted         1.15    1.27    1.38    1.49    1.58    1.66    1.74
Eight responses
  Successes
    Observed          1.06    1.27    1.36    1.28    1.30    1.24    1.23
    Predicted         1.21    1.38    1.34    1.30    1.26    1.22    1.19
  Errors
    Observed          1.21    1.53    1.64    1.67    1.88    1.81    2.02
    Predicted         1.21    1.39    1.55    1.70    1.83    1.93    2.02


5. DISCUSSION

The results of the analyses of the latency data show that the use of different
numbers of response alternatives produces marked effects on error latencies.
When backward latency curves are constructed so that the last error trial coincides
for each item it can be seen that the difference between the curves of error and
success latencies on pre-criterion trials is an increasing function of the number of
response alternatives. No differences are apparent with two response alternatives,
as other investigators have reported, and with three response alternatives
Suppes et al. (1966) found that the difference occurs only on the trial of the last
error. Since a fixed number of stimuli were used in the present study the
possibility that these differences are due to the number of stimuli paired with a
given response cannot be ruled out. However, in the two-response condition
four stimuli were paired with each response, as was the case in the Suppes et al.
study in which 12 stimuli were paired with three responses. The sharp increase
in mean latency on the trial of the last error found by Suppes et al. did not
occur in the two-response condition, which suggests that the effect is due rather
to the number of response alternatives.

The four-state model considered here satisfactorily accounted for both the
frequency and latency data from the four- and eight-response conditions. The
similarity between the estimates of the parameters a, f and c for the two
conditions indicates that differences in proportions of correct responses over trials
are due to the different guessing probabilities. The values of the parameter
estimates for the two-response condition, on the other hand, are quite different 
indicating that the storage and retrieval process for this condition does not 
operate in the same way as in the four- and eight-response conditions. Restle 
(1965) noted that in a PA task involving two response alternatives, most subjects 
reported learning all stimuli paired with one of the responses and giving the 
other response to the remaining stimuli. This is a highly efficient learning 
strategy, since it reduces the learning task to learning only half of the stimuli. 
The different parameter values obtained in the two-response condition of the 
present experiment could be due to a substantial proportion of the subjects 
adopting this strategy. Furthermore, the relatively poor fit of the four-state 
model to the data from the two-response condition could be due to some subjects 
using this strategy and others learning each item separately. The likely effect
of this would be that parameter values for the two groups of subjects differ
considerably, so that the obtained values are a composite, with the resulting poor
fit of the model reflecting this state of affairs.

The order relationships holding among the estimates of the latency
parameters for the four- and eight-response conditions are as predicted from the
memory-search mechanism outlined earlier. This does not, of course,
demonstrate the validity of the assumptions made, but does suggest that a more precise
formulation of the process is worthy of consideration. Such a formulation
might be aimed at relating the mean of the latency distribution in each state to
scan time per item. It is unlikely, however, that such a relationship would
turn out to be just a simple linear function, since even after 20 trials there are
still clear differences in mean response latency among the three conditions.
This suggests that the assumption that a memory search is not initiated for items
stored in long-term memory is incorrect. However, the higher mean latency
for the four-response condition found on post-criterion trials indicates that a
search of long-term memory is not organized simply on the basis of number of
response alternatives.

One of the main differences between the four-state memory model con- 
sidered here and the three-state model used by Suppes et al. (1966) is that the 
single intermediate state of the three-state model has been replaced by two inter- 
mediate states, which have been identified as a short-term memory state and a 
forgetting state. Apart from the better fit of the four-state model found by 
Atkinson & Crothers (1964), the large differences obtained between the esti- 
mates of the means of the latency distributions in these states indicate that the 
assumption of a single latency distribution for this stage of learning is inadequate. 
THE GENETIC ANALYSIS OF CONTINUOUS VARIATION:
A COMPARISON OF EXPERIMENTAL DESIGNS APPLICABLE TO
HUMAN DATA


By L. J. Eaves 


Genetics Department, University of Birmingham 


The relative efficiencies of various experimental designs are compared for
the genetical analysis of human behaviour. Particular attention is paid to the
ability of the different designs to separate additive genetic variation from
dominance variation. The utility of twins is investigated and it is shown that the
relative number of twins need not be large for the statistical efficiency of the
experiment to be optimal. The possibility of using foster-siblings in the place
of monozygotic twins reared apart is considered. It is demonstrated that a basic
and efficient genetical analysis may be undertaken without recourse to monozygotic
twins reared apart. The consequences for efficiency of limited availability of
certain groups are investigated, and the method of the study illustrated with the
aid of a worked example.


1. INTRODUCTION 


This investigation was occasioned by the need to find a suitable alternative 
to twins for psychogenetical studies. A severe limitation of twin studies is the 
problem of ascertainment, i.e. the effort and expense involved in locating, 
diagnosing and possibly even transporting monozygotic twins. It would 
therefore be advantageous if an alternative could be found which in no way 
prejudices the efficiency of the statistical estimation procedure involved in the 
analysis of the characters measured. 

An experiment to study the causes of continuous variation should not only 
enable the variation in a population to be partitioned into environmental and 
genetic causes but it should also provide information on the type of gene action 
involved and permit a statistical test of the adequacy of the assumptions
underlying the genetical model used.

The usual twin study employs monozygotic twins reared together and 
dizygotic twins reared together. Intraclass correlations are calculated for both 
groups of twins from which an index of heritability can be obtained (Holzinger, 
1929). The ratio of the within-pair variance of dizygotic twins to that for 
monozygotic twins is used as an F-test of the importance of genetic factors in 
determining variation. This is the method employed in the Michigan Twin 
Study by Vandenburg (1962). References have been made to the inadequacy 
of this design as a source of genetic information, for example, by Gottesman 
(1966). A more precise study of its limitations is provided by Jinks & Fulker 
(1969), who recognize two primary deficiencies of the classical twin study: its 
failure to take account of variation due to genetic and environmental influences
acting between families; and the fact that it can provide no conclusive information 
about the type of gene action involved. 

The MAVA approach of Cattell (1960) recognizes the possible importance 
of between-family components of variation but does not distinguish additive and 
dominance components of genetic variation. The MAVA equations also
incorporate too many parameters ever to allow a test of goodness of fit of the model.

The approach of biometrical genetics developed by Mather (1949) provides
the basis for a genetically meaningful evaluation of the principal components
of variation which is readily extended to the analysis of continuous variation in
a human population. The application of biometrical genetics to human
psychogenetics is the subject of a detailed study by Jinks & Fulker (1969).


2. THE BIOMETRICAL MODEL AND ITS LEAST SQUARES SOLUTION


For a sample of $n$ pairs of individuals a simple analysis of variance of within-
and between-pairs differences may be obtained:


Item             Degrees of freedom    Expectation of mean squares
Between pairs    $n - 1$               $\sigma_w^2 + 2\sigma_b^2$
Within pairs     $n$                   $\sigma_w^2$


When the familial relationship between members of a pair is the same for
all the pairs in the sample, $\sigma_b^2$ and $\sigma_w^2$ may be shown to have expectations in
terms of four main components of variation, providing that there is no significant
variation due either to genotype-environment interaction or to correlation
between genotype and environment. The four main components may be defined
as follows: (1) additive heritable variation ($D_R$); (2) non-additive, dominance,
heritable variation ($H_R$); (3) variation due to environmental differences within
pairs ($E_1$); (4) variation due to environmental influences between pairs ($E_2$).
Mather describes the method for the derivation of the expectations for the $\sigma^2$'s
in terms of the parameters of the model, and gives the definitions of $D_R$ and $H_R$
in terms of additive and dominance effects of the genes and gene frequencies in
the population.

Combining the $\sigma^2$'s with their coefficients in the analysis of variance gives
the expectations of mean squares for the items of the analysis in terms of the four
main components of variation. Table 1 gives the expectations of mean squares
for the ten second-degree statistics obtainable for the five familial relationships
relevant to this investigation. In addition to the two assumptions already
mentioned, these expectations also assume that the population is mating at
random and that the environmental components are the same for all pairs
regardless of the relationship between the individual members of a pair.


Jinks & Fulker give a more extended discussion of the assumptions
underlying the model and suggest tests for genotype-environment interaction and
genotype-environment correlation. The randomness of mating and the equality
of environmental components for twins, siblings and half-siblings are assessed
after the analysis by the test of goodness of fit of the model.


TABLE 1

                                          Coefficients of parameters
                                          Genetic              Environmental
Mean square                               $D_R$     $H_R$      $E_1$    $E_2$
Monozygotic twins reared together
    Between pairs                         1.0       0.5        1.0      2.0
    Within pairs                          0.0       0.0        1.0      0.0
Monozygotic twins reared apart
    Between pairs                         1.0       0.5        1.0      1.0
    Within pairs                          0.0       0.0        1.0      1.0
Full-siblings reared together
    Between pairs                         0.75      0.3125     1.0      2.0
    Within pairs                          0.25      0.1875     1.0      0.0
Full-siblings reared apart
    Between pairs                         0.75      0.3125     1.0      1.0
    Within pairs                          0.25      0.1875     1.0      1.0
Half-siblings reared together
    Between pairs                         0.625     0.25       1.0      2.0
    Within pairs                          0.375     0.25       1.0      0.0

Two estimates of heritability may be obtained after the four parameters have
been estimated. These are: the narrow heritability, $h_N^2$, which is the proportion
of the total variation due to additive heritable variation, and the broad heritability,
$h_B^2$, which is the proportion of the total variation due to additive and
non-additive heritable variation.
The total variation, $\sigma_T^2$, is defined as $\sigma_b^2 + \sigma_w^2$ and can be calculated from
$\sigma_T^2 = \frac{1}{2}D_R + \frac{1}{4}H_R + E_1 + E_2$. The two estimates of heritability are given by:


$h_N^2 = \frac{1}{2}D_R/\sigma_T^2$ and $h_B^2 = (\frac{1}{2}D_R + \frac{1}{4}H_R)/\sigma_T^2$.
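
The two heritability definitions can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `heritabilities` is hypothetical, and the example values (D_R = H_R = 2/3, E_1 = E_2 = 0.25) are those used in the worked example of Section 4.

```python
def heritabilities(d_r, h_r, e1, e2):
    """Return (narrow, broad) heritability from the four components.

    d_r, h_r: additive and dominance components (D_R, H_R)
    e1, e2:   within-pair and between-pair environmental components
    """
    total = 0.5 * d_r + 0.25 * h_r + e1 + e2        # total variation
    narrow = 0.5 * d_r / total                       # additive part only
    broad = (0.5 * d_r + 0.25 * h_r) / total         # additive + dominance
    return narrow, broad


# Illustrative values taken from the worked example in Section 4.
narrow, broad = heritabilities(2 / 3, 2 / 3, 0.25, 0.25)
print(round(narrow, 4), round(broad, 4))
```

With these values the total variation is unity, so the broad heritability equals the assumed 0.5 and the narrow heritability is one third.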


The relative importance of dominance may be assessed either by statistical
comparison of the two heritability estimates, or by the ratio $\sqrt{H_R/D_R}$ given
by Mather (1949). This ratio, however, only provides an estimate of the degree
of dominance when the gene frequencies are equal at all the loci involved in the
expression of the measured trait. The variance for either of these tests of
dominance is a function of the variance of the difference between the estimates
$D_R$ and $H_R$. This will be referred to later in the discussion of criteria for
the efficiency of possible sets of data for the estimation of the four main
components of variation.
Inspection of Table 1 will confirm that a number of sets of simultaneous
equations can be obtained which could be solved to provide estimates of the
parameters of the genetical model. The present discussion is confined to three
different combinations of six statistics which constitute minimal sets of data
enabling the four parameters to be estimated.
Set 1. Monozygotic twins raised together ($MZ_T$), monozygotic twins raised
apart ($MZ_A$), and full-siblings raised together ($S_T$).

Set 2. Monozygotic twins raised together, full-siblings raised apart ($S_A$), and
full-siblings raised together.

Set 3. Monozygotic twins raised together, full-siblings raised apart, and
half-siblings raised together ($HS_T$).

For each minimal set of data there are thus three distinct groups for which
the variances within and between pairs can be estimated, providing the six
statistics from which the parameters may be estimated. Since four parameters are
to be estimated from six statistics, the method of weighted least squares is
recommended as that which makes the maximum use of [...]

Mather outlines the application of this method to second-degree statistics. It
has been shown [...] statistics, $x$ the vector of second-degree statistics [...]
parameters.


Since, for each minimal set of data, there are six statistics, there are six
degrees of freedom available of which four are removed by estimating the
parameters. The two remaining degrees of freedom permit a statistical test of
the assumption of random mating, and the equality of the environmental
components for twins, siblings and half-siblings.


As far as possible, an experiment [...] population by the method of least
squares [...] data set. For experiments of comparable size the precision and the
correlations of the estimates depend greatly upon the model assumed and on the
structure of the experiment. Conclusions of a qualitative nature about the efficiency of
a given data set for the estimation of the components of a model may be inferred
from an examination of the expectations for a given set of statistics in terms of
the parameters of the model assumed. This may be illustrated by reference to
the first six statistics for which expectations are given in Table 1. These form
the first minimal set of data given previously. It may be seen that the within-pair
variance for monozygotic twins reared together is an estimate of $E_1$ alone, and
consequently the estimation of this parameter will be the most efficient. This
has been demonstrated consistently during the reanalysis of published data even
when the experimental design is inadequate. It is noticeable that the coefficients
of $E_1$ and $E_2$ are virtually uncorrelated, with the result that little or no correlation
would be expected between the estimates of these two parameters. This is not
the case with $D_R$ and $H_R$, however. The coefficients of the parameters in the
expectations of the statistics are highly correlated. This suggests a high negative
correlation will be detected between estimates of $D_R$ and $H_R$.

In any actual analysis the correlations between the estimates can be
calculated directly from the elements of the inverse matrix thus:

$r_{nm} = A_{nm}/\sqrt{A_{mm} \cdot A_{nn}}$,


where $r_{nm}$ is the correlation of the estimates of $m$ and $n$, $A_{nm}$ is the corresponding
off-diagonal (covariance) element of the inverse matrix, and $A_{mm}$ and $A_{nn}$ are
the variances of the estimates of $m$ and $n$ respectively, derived from the leading
diagonal of the inverse matrix. Typical values for the correlations, for minimal
data set 2, are calculated for the worked example given in Section 4. Reference
to this example at this stage will confirm some of the conclusions given above.
The variances and covariances of the genetic parameters are much larger than
those of the environmental parameters. It is evident, therefore, that the
estimation of the genetic components of variation is much less efficient than the
estimation of the environmental components. With this in mind the ensuing discussion
concentrates on the efficiency of the estimation of the genetic parameters, since
this is the factor which limits the value of all designs. The estimates of $D_R$
and $H_R$ both have large variances, and the large negative covariance between
them results in the statistical comparison of $D_R$ with $H_R$ being extremely
inefficient, since the variance of the difference between them,
$V(D_R - H_R) = V(D_R) - 2\,\mathrm{cov}(D_R, H_R) + V(H_R)$,
is greatly inflated by the negative covariance of $D_R$ and $H_R$.
The magnitude of the variance of the difference between $D_R$ and $H_R$ is
thus an indication of the precision with which a given set of data can distinguish
additive from dominance variation. Since the estimation of environmental
components is demonstrably more efficient than the estimation of genetic
components, the value of $V(D_R - H_R)$ is a suitable guide to the overall efficiency
of an experimental design.
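
The correlation between two estimates and the variance of their difference can be illustrated with a short Python sketch. The helper names are hypothetical; the numerical inputs are the variances and covariance of the estimates of D_R and H_R quoted in the inverse matrix of the worked example (Section 4, Table 5).

```python
import math


def correlation(cov_mn, var_m, var_n):
    """Correlation of two estimates from elements of the inverse matrix."""
    return cov_mn / math.sqrt(var_m * var_n)


def variance_of_difference(var_m, var_n, cov_mn):
    """V(m - n) = V(m) - 2 cov(m, n) + V(n)."""
    return var_m - 2.0 * cov_mn + var_n


# Values quoted from Table 5 of the worked example.
var_d, var_h, cov_dh = 77.1064, 264.3666, -125.1624

r_dh = correlation(cov_dh, var_d, var_h)
v_diff = variance_of_difference(var_d, var_h, cov_dh)
print(round(r_dh, 4), round(v_diff, 4))
```

Because the covariance is negative, the term subtracted in the variance of the difference becomes an addition, which is why the comparison of the two genetic components is so imprecise.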
3. THE METHOD OF THIS STUDY


Computer-simulated weighted least squares analyses were conducted to
investigate the effect of various factors on the efficiency with which additive and
dominance variation can be separated. For each of the three minimal sets of
familial relationships data were generated for three levels of broad heritability,
namely,
$(\frac{1}{2}D_R + \frac{1}{4}H_R)/(\frac{1}{2}D_R + \frac{1}{4}H_R + E_1 + E_2) = 0.1$, 0.5 and 0.9,
and for each of three values of $\sqrt{H_R/D_R}$, namely 0.1, 0.5 and 1.0.

By assuming unit total variance and uniform distribution of environmental
variation within and between families, i.e. $E_1 = E_2$, it is possible to calculate
the relative values of the four parameters, $D_R$, $H_R$, $E_1$ and $E_2$, for each combination
of level of heritability and ratio of $H_R$ to $D_R$ (see Section 4).

The mean squares obtainable from the analysis of variance of the hypothetical
population can then be calculated by combining the relative values of the
four parameters according to their contributions to the expectations of mean
squares given in Table 1. It is then possible to carry out the first part of the
weighted least squares analysis on these 'data' to obtain the
variance-covariance matrix. The variance of the difference between $D_R$ and $H_R$ was
calculated from the appropriate elements of the inverse matrix, and the reciprocal
of this value was taken as an index ($I$) of the efficiency of the design. Thus:

$I = 1/V(D_R - H_R)$.

The proportions of pairs of like-related individuals contributing to each
pair of mean squares were varied between separate simulations. The proportions
in two of the groups were allowed to take all reasonable values between 0.1 and
0.7, by gradations of 0.1, with the proportion in the third group fixed by the
restriction that the three proportions should sum to unity. It was assumed
that the sample size was sufficiently large for the degrees of freedom within and
between pairs to be considered equal.
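
A grid of this kind can be enumerated as in the sketch below. It is an assumption-laden illustration, not the original program: the function name `proportion_grid` is hypothetical, "reasonable" is taken to mean only that the third proportion is at least 0.1, and the arithmetic is done in integer tenths to avoid floating-point drift.

```python
def proportion_grid(low=1, high=7, total=10):
    """Enumerate (p1, p2, p3) in steps of 0.1 with p1 + p2 + p3 = 1.

    p1 and p2 run from low/total to high/total; p3 is fixed by the
    restriction that the three proportions sum to unity, and a
    combination is kept only if p3 is at least one step (0.1).
    """
    grid = []
    for a in range(low, high + 1):
        for b in range(low, high + 1):
            c = total - a - b
            if c >= 1:
                grid.append((a / total, b / total, c / total))
    return grid


grid = proportion_grid()
print(len(grid), grid[0])
```

Each triple in `grid` would then be passed to the efficiency calculation in turn.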

Each statistic was weighted by its corresponding amount of information,
which was taken to be $1/e_{ij}$, where $e_{ij}$ is the theoretical error variance of the
statistic, calculated from:

$e_{ij} = 2V_{ij}^2/n_i$.
$V_{ij}$ is the $j$th statistic estimated from the $i$th group of like-related pairs of which
$n_i$ pairs are measured.
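
The weighting rule can be written out as a small Python sketch. The function names are hypothetical; the example inputs (mean squares 1.75 and 0.25 for a group forming a proportion 0.2 of the sample) are those of the monozygotic twins reared together in the worked example of Section 4.

```python
def error_variance(mean_square, n_pairs):
    """Theoretical error variance of a mean square: e = 2 V^2 / n."""
    return 2.0 * mean_square ** 2 / n_pairs


def weight(mean_square, n_pairs):
    """Amount of information carried by the statistic: w = 1 / e."""
    return 1.0 / error_variance(mean_square, n_pairs)


# Between- and within-pairs mean squares for MZ twins reared together,
# with a relative group size of 0.2 (values from the worked example).
w_between = weight(1.75, 0.2)
w_within = weight(0.25, 0.2)
print(round(w_between, 6), round(w_within, 6))
```

A small mean square measured on many pairs therefore receives a large weight, which is why the within-pairs statistic for monozygotic twins dominates the estimation of the within-pair environmental component.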

An approximate idea of the optimum relative proportions was obtained by
inspection of the values of $I$ generated during the first part of the simulation
program. More precise values were obtained by repeating the simulation
intensively over a smaller range of relative proportions of the three groups of
pairs comprising the minimal set investigated.

The computations were conducted in full for each of the three minimal sets
of data, for all the possible combinations of heritability and ratios of $H_R$ to $D_R$.
The necessary computations were carried out on the KDF9 computer at
Birmingham University.

4. WORKED EXAMPLE
The basic computational procedure involved in the simulations will be
illustrated in outline for one example. The $MZ_T$, $S_A$, $S_T$ set of data is considered
with the relative proportions, $n_i$, of the three groups fixed at 0.2, 0.5 and 0.3
respectively.
Assuming unit total variance, $h_B^2 = 0.5$, $\sqrt{H_R/D_R} = 1.0$, and $E_1 = E_2$:

$(\frac{1}{2}D_R + \frac{1}{4}H_R)/\sigma_T^2 = 0.5$

but

$\sigma_T^2 = 1.0$,

so

$\frac{1}{2}D_R + \frac{1}{4}H_R = 0.5$. (1)

Also

$\sqrt{H_R/D_R} = 1.0$,

whence

$H_R/D_R = 1.0$

and

$D_R - H_R = 0$. (2)

Solving the two simultaneous eqns. (1) and (2) for $H_R$ gives

$1.5H_R = 1.0$,

whence

$H_R = D_R = 0.667$.

Also, since $E_1 = E_2$ and $E_1 + E_2 = 1 - h_B^2$,

$E_1 = E_2 = (1 - h_B^2)/2 = 0.25$.
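
The same derivation can be generalised to any combination of broad heritability and dominance ratio, as in the following sketch. The function name `components` is hypothetical, and the closed form simply substitutes H_R = ratio^2 x D_R into equation (1) under the stated assumptions of unit total variance and equal environmental components.

```python
def components(broad_h2, dominance_ratio, total=1.0):
    """Return (D_R, H_R, E_1, E_2) for a given broad heritability and
    dominance ratio sqrt(H_R / D_R), assuming E_1 = E_2.
    """
    ratio_sq = dominance_ratio ** 2                  # H_R / D_R
    # (1/2) D_R + (1/4) H_R = broad_h2 * total, with H_R = ratio_sq * D_R
    d_r = broad_h2 * total / (0.5 + 0.25 * ratio_sq)
    h_r = ratio_sq * d_r
    e = (1.0 - broad_h2) * total / 2.0               # split equally
    return d_r, h_r, e, e


d_r, h_r, e1, e2 = components(0.5, 1.0)
print(round(d_r, 3), round(h_r, 3), e1, e2)
```

For the case treated in the text (heritability 0.5, ratio 1.0) this reproduces the values 0.667, 0.667, 0.25 and 0.25.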


The expectations of the six statistics of this set of data in terms of the four
parameters for the A matrix are given in Table 2.


TABLE 2. EXPECTATIONS IN TERMS OF THE FOUR PARAMETERS FOR THE A MATRIX


                                    Parameter
Statistic                  $D_R$     $H_R$      $E_1$    $E_2$
$MZ_T$  between pairs      1.00      0.5000     1.0      2.0
        within pairs       0.00      0.0000     1.0      0.0
$S_A$   between pairs      0.75      0.3125     1.0      1.0
        within pairs       0.25      0.1875     1.0      1.0
$S_T$   between pairs      0.75      0.3125     1.0      2.0
        within pairs       0.25      0.1875     1.0      0.0

The vector of parameters, $\theta$, is $D_R = 0.667$, $H_R = 0.667$, $E_1 = 0.250$,
$E_2 = 0.250$. $A\theta$ gives the vector of statistics ($x$) which form the [...]. The
elements of $w$, the matrix of weights, are calculated from $1/e_{ij}$ as described in
Section 3. See Table 3.
The information matrix, calculated from $M = (A'wA)$, is given in Table 4.
The inverse of $M$ is the variance-covariance matrix, given in Table 5. The
off-diagonal elements in the upper half of the inverse are the covariances of the
estimates, and the corresponding elements in the lower half are replaced by the
correlations between the estimates calculated as in Section 3. The correlations
between the estimates are printed in italics.


TABLE 3. VARIANCE (x) AND WEIGHT (w) FOR THE VARIOUS ITEMS


Item                       Variance (x)    Weights (w)
$MZ_T$  between pairs      1.750000        0.032653
        within pairs       0.250000        1.600000
$S_A$   between pairs      1.208333        0.171225
        within pairs       0.791667        0.398892
$S_T$   between pairs      1.458333        0.070531
        within pairs       0.541667        0.511243


TABLE 4. INFORMATION MATRIX CALCULATED FROM $M = (A'wA)$


$D_R$    0.225524    0.115651    0.441504    0.399244
$H_R$                0.063769    0.262526    0.205035
$E_1$                            2.784544    0.776485
$E_2$                                        0.982853


TABLE 5. INVERSE OF M


$D_R$    77.1064    -125.1624     1.3181    -6.2524
$H_R$    -0.8766     264.3666    -4.9737    -0.3785
$E_1$     0.1915      -0.3902     0.6144     0.0168
$E_2$    -0.3740      -0.0003     0.0112     3.6229



$I$ is now calculated as $1/[V(D_R) - 2\,\mathrm{cov}(D_R, H_R) + V(H_R)]$:
$I = 1/(77.11 + 250.32 + 264.37)$
$= 1/591.80$
$= 0.0016898$.


For ease of comparison the values of $I$ obtained were multiplied by 1000. Thus
for this set of data under the given assumptions about heritability and dominance
the index of efficiency obtained, $I$, is 1.690.
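
The whole worked example can be retraced with the following Python sketch, offered as an illustration rather than as the original KDF9 program. The coefficient rows are those of Table 1 for the three groups of this set, the parameter values are those derived above, and the matrix inversion uses a plain Gauss-Jordan routine (the helper `invert` is an assumption of this sketch, chosen only to keep the example free of external libraries).

```python
# Coefficients of D_R, H_R, E_1, E_2 for the MZ_T, S_A, S_T set (Table 1).
A = [
    [1.00, 0.5000, 1.0, 2.0],   # MZ_T between pairs
    [0.00, 0.0000, 1.0, 0.0],   # MZ_T within pairs
    [0.75, 0.3125, 1.0, 1.0],   # S_A between pairs
    [0.25, 0.1875, 1.0, 1.0],   # S_A within pairs
    [0.75, 0.3125, 1.0, 2.0],   # S_T between pairs
    [0.25, 0.1875, 1.0, 0.0],   # S_T within pairs
]
theta = [2 / 3, 2 / 3, 0.25, 0.25]             # D_R, H_R, E_1, E_2
proportions = [0.2, 0.2, 0.5, 0.5, 0.3, 0.3]   # group size for each row


def invert(matrix):
    """Invert a square matrix by Gauss-Jordan elimination with pivoting."""
    n = len(matrix)
    aug = [list(row) + [float(i == j) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * c for v, c in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


# Expected mean squares x = A * theta, and weights w = n / (2 x^2).
x = [sum(a * t for a, t in zip(row, theta)) for row in A]
w = [n / (2.0 * v * v) for v, n in zip(x, proportions)]

# Information matrix M = A' w A and its inverse (variance-covariance matrix).
M = [[sum(w[k] * A[k][i] * A[k][j] for k in range(6)) for j in range(4)]
     for i in range(4)]
C = invert(M)

var_diff = C[0][0] - 2.0 * C[0][1] + C[1][1]   # V(D_R - H_R)
index = 1000.0 / var_diff                       # I multiplied by 1000
print(round(index, 3))
```

Changing `proportions` or `theta` and re-running the last few lines gives the index for any other design or combination of heritability and dominance ratio.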


5. RESULTS AND CONCLUSIONS 

The variation of $I$ with differing proportions of three types of pair
constituting a given set of data showed a consistent pattern for all three minimal sets of
data investigated. Table 6 represents two sets of results for intermediate values
of heritability and dominance. The results are given for the first stage of the
simulation only, for the first and second minimal set of data. It is clear that the
index of efficiency, $I$, has a single optimum value for a given set of data, and that
extreme departure from the relative proportions of the three groups of pairs at
which this optimum is generated will lead to considerable loss of statistical
efficiency. More precise optimal proportions and values for $I$ for all three sets
of data and for all nine possible combinations of heritability and dominance are
given in Table 7. The results given here lead to four principal conclusions:
TABLE 6. VALUES OF I FOR DATA SETS 1 AND 2 ASSUMING THE BROAD HERITABILITY
TO BE 0.5, AND THE DOMINANCE RATIO TO BE 0.5
(The upper value of each pair refers to the $MZ_T$, $MZ_A$, $S_T$ set, and the lower value (in italics)
refers to the $MZ_T$, $S_A$, $S_T$ set. Values of $I$ are multiplied by $10^3$.)


                                  Proportion of $MZ_T$
Proportion
of $S_T$     0.1       0.2       0.3       0.4       0.5       0.6       0.7
0.1                    0.3607    0.3159    0.3187    0.3177    0.3125    0.3002
                       1.2610    1.2302    1.1645    1.0679    0.9344    0.7432
0.2          0.4438    0.5200    0.5446    0.5486    0.5369    0.5031    0.4143
             1.6184    1.6753    1.5809    1.4163    1.1920    0.8994    0.5160
0.3          0.5490    0.6670    0.7007    0.6929    0.6425    0.5062
             1.7269    1.7427    1.5696    1.3058    0.9640    0.5372
0.4          0.6157    0.7594    0.7849    0.7333    0.5647
             1.6798    1.6189    1.3584    0.9971    0.5478
0.5          0.6526    0.7972    0.7781    0.5992
             1.5270    1.3652    1.0112    0.5532
0.6          0.6599    0.7629    0.6114
             1.2862    1.0070    0.5551
0.7          0.6260    0.5928
             0.9592    0.5525
0.8          0.5025
             0.5370

1. Of the three minimal sets of data investigated, that which incorporates
half-siblings is marginally better than the $MZ_T$, $S_A$, $S_T$ set. This, however,
must be weighed against the possibility that equality of the environmental
components between half-siblings and pairs of the other degrees of relationship
is less likely because half-siblings have been subject to a greater range of
environmental variation within families since half-sibling pairs have only one common
parent. What is more evident from the simulations is that both the $MZ_T$, $S_A$, $S_T$
set, and the $MZ_T$, $S_A$, $HS_T$ set are considerably more efficient than the $MZ_T$,
$MZ_A$, $S_T$ set.

2. The conclusions above are valid for all the values of broad heritability
and $\sqrt{H_R/D_R}$ considered, though it must be conceded that the difference in
efficiency between the three sets of data is less when the heritability is high.

3. The efficiency of the separation of $D_R$ and $H_R$ is greatly increased when
the heritability is high, but, for a given level of $h_B^2$, it can be seen that the effect
of altering $\sqrt{H_R/D_R}$ is slight.

4. For [...] the number of twins required to give
optimum efficiency is smaller, but the number of full-siblings is correspondingly
increased.
It is arguable that the presentation of precise optimum proportions is of
little use, since it is unlikely that an experiment would be necessary if the level
of heritability and the type of gene action were known sufficiently precisely to
enable designs as exact as those described above. It is probable, too, that an
experiment would be designed to investigate the genetic and environmental
influences on a wide range of characters for which the heritabilities are likely to
vary greatly. In this case the best procedure will be to accept a compromise
solution and to design the experiment on the assumption that the broad
heritability is in the intermediate range. Examination of the estimates of heritability
already obtained from twin studies may provide some indication of the importance
of genetic factors in determining variation for a wide range of measurements.
Inspection of the heritabilities obtained for the Michigan Twin Study
(Vandenburg, 1962) and the Harvard Twin Study (Gottesman, 1966), for example,
suggests that most heritabilities lie between 0.0 and 0.5, though the inadequacy
of such estimates must be borne in mind.

The results of such a compromise are given in Table 8, for the minimal data
sets 1 and 2. It was assumed that the heritability was about 0.5, and
consequently the relative proportions of the relatives were fixed at values which gave
reasonable efficiency at this level of heritability. The frequencies for set 1 were
fixed at 0.2, 0.3 and 0.5 for $MZ_T$, $MZ_A$ and $S_T$ respectively, and at 0.2, 0.5 and
0.3 for $MZ_T$, $S_A$ and $S_T$ for set 2. Values of $I$ were then obtained for these
frequencies for each of the combinations of heritability and dominance
investigated.


TABLE 8. THE EFFICIENCY OF A COMPROMISE DESIGN COMPARED WITH THE
MAXIMUM EFFICIENCY OBTAINABLE UNDER THE SAME CONDITIONS


(The proportions of $MZ_T$ and $S_T$ were fixed at 0.2 and 0.5 respectively for the $MZ_T$, $MZ_A$,
$S_T$ set, and at 0.2 and 0.3 for the $MZ_T$, $S_A$, $S_T$ set. Values of $I$ are multiplied by $10^3$.)


                                  Set 1                     Set 2
$h_B^2$   $\sqrt{H_R/D_R}$   Compromise   Optimum     Compromise   Optimum
0.1       0.1                0.4482       0.5022      1.3774       1.3934
          0.5                0.4472       0.5007      1.3736       1.3900
          1.0                0.4454       0.4975      1.3679       1.3829
0.5       0.1                0.8090       0.8191      1.7699       1.7986
          0.5                0.7974       0.8066      1.7452       1.7739
          1.0                0.7739       0.7811      1.6920       1.7220
0.9       0.1                1.4880       1.9872      2.1978       2.4980
          0.5                1.4326       1.9438      2.1367       2.4425
          1.0                1.3210       1.8503      2.0120       2.3206


It will be noted that the compromise does reduce the efficiency of the
estimation of $H_R$ and $D_R$ when the heritability is appreciably different from the value
assumed, $h_B^2 = 0.5$, although the values obtained for $I$ are tolerably close to the
optimum values. The conclusions arrived at earlier are still tenable since it
can be seen that even a compromise design for minimal set 2 is more efficient
than the best design attainable for set 1 under any circumstances.
Certain practical observations about the design of experiments for human
psychogenetics are appropriate in the light of the foregoing discussion. All
three minimal sets of data enable estimates of the four main components of
continuous variation to be obtained, and permit a test of the adequacy of the
genetical model if the method of weighted least squares is adopted. It has been
shown, however, that the set which requires monozygotic twins reared apart
(set 1) is considerably less efficient than the two alternative sets over the whole
range of heritability and dominance studied. For an experiment of given size,
the minimal sets which incorporate foster-siblings are between two and three
times as efficient as the set which includes monozygotic twins reared apart.
This may be clarified by a numerical illustration: a design using 100 pairs of
subjects, of which 20 pairs were monozygotic twins reared together, 50 were
foster-siblings and 30 were full-siblings reared together, is expected to produce
estimates of the dominance ratio as efficient as a design using 250 pairs of subjects
of which 50 were monozygotic twins reared together, 75 were monozygotic
twins reared apart and 125 were full-siblings reared together. It is evident
from this that to obtain equal efficiency for the estimation of the genetic
components of variation far more monozygotic twins are required for the $MZ_T$,
$MZ_A$, $S_T$ set than for the $MZ_T$, $S_A$, $S_T$ set.
From the point of view of cost-effectiveness, therefore, statistical and
practical considerations coincide in favour of designing experiments in human
psychogenetics which incorporate foster-siblings rather than monozygotic twins
reared apart. The simulations confirm that much of the money and effort at
present directed towards the collection of twin data might profitably be diverted
to the collection of data on foster-siblings. The latter are certainly easier to
ascertain and, except in cases of doubtful paternity, they do not require the
expensive and time-consuming procedure of zygosity diagnosis which is essential
for twin studies. In any case it has been remarked that the use of foster-siblings
substantially reduces the number of twins required to give results of comparable
efficiency.
A situation might be envisaged in which financial resources permit a large
experiment to be planned although the availability of twins and foster-siblings
is limited, with the result that they could not be ascertained in sufficient numbers
to give the proportions necessary for optimal efficiency. Under these
circumstances it would be possible either to maintain the total size of the experiment
by the inclusion of extra full-siblings reared together, or to reduce the total size
of the experiment so that the proportions of the different groups were comparable
to those required for optimum efficiency. It is necessary, therefore, to
investigate the opposing influences of experimental size and disproportionate numbers
of the groups which constitute the set of data on the efficiency of the estimation
procedure.
This was the subject of the second part of the simulation program. For
both minimal sets 1 and 2 a simulation was started assuming an experiment of
unit size and the relative proportions of the pairs of individuals of the different
[...] index of efficiency, $I$, was computed for both sets of data, for all the nine
combinations of heritability and dominance ratio stated before, for successive
increases in size of the experiment. After the experiment had increased in size
by a factor of 20 the change in $I$ was sufficiently small for the value of $I$ at this
size to be regarded as stable.


[Figure 1. Horizontal axis: relative size of experiment. Curves: $MZ_T$, $MZ_A$, $S_T$ set and $MZ_T$, $S_A$, $S_T$ set.]


FIGURE 1. The effect on efficiency, $I$, of increasing the relative size of the experiment solely
by adding to the number of full-sibling pairs reared together. Graphs of $I$ against
relative size of experiment for the two minimal sets of data, $MZ_T$, $MZ_A$, $S_T$ and
$MZ_T$, $S_A$, $S_T$, when the broad heritability is 0.9, and the ratio $\sqrt{H_R/D_R}$ is 1.0.


Fig. 1 is the graph of the change in $I$ with increasing experimental size for a
given value of heritability and dominance ratio, for the two sets of data
investigated. The efficiency is seen to increase sharply at first, but when $n = 5$ most of
the possible improvement in efficiency has already been realized.

The value of $I$ after a twenty-fold increase in the size of the experiment was
adopted as a stable value. The difference between this value and the efficiency
for an experiment of unit size when the proportions are optimal, $I_1$, was
the range of improvement of efficiency which can be realized solely by adding to
the number of full-siblings reared together. The increase in efficiency which
was realized after increasing the total size of the experiment by a factor $n$,
$(I_n - I_1)$, was expressed as a percentage, $p_n$, of the total possible gain in efficiency:

$p_n = 100(I_n - I_1)/(I_{20} - I_1)$.

In order that the different sets of data may be compared with respect to their
sensitivity to changes in experimental size and disparity of proportions, the overall
gain in efficiency was expressed as a percentage, $g$, of the final efficiency, $I_{20}$:

$g = 100(I_{20} - I_1)/I_{20}$.


The values obtained for $g$, and selected values for $p_n$ are tabulated in Table 9.
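
The two percentages can be computed as in the sketch below. The function names are hypothetical, and the three efficiency values used in the example are invented purely for illustration; they are not results from the simulations.

```python
def percent_of_possible_gain(i_n, i_1, i_20):
    """p_n: gain realized at relative size n, as a percentage of the
    total gain available between unit size and twenty-fold size."""
    return 100.0 * (i_n - i_1) / (i_20 - i_1)


def overall_gain(i_1, i_20):
    """g: total gain as a percentage of the final efficiency I_20."""
    return 100.0 * (i_20 - i_1) / i_20


# Hypothetical efficiencies at relative sizes 1, 5 and 20.
i_1, i_5, i_20 = 1.0, 1.8, 2.0

p_5 = percent_of_possible_gain(i_5, i_1, i_20)
g = overall_gain(i_1, i_20)
print(round(p_5, 1), round(g, 1))
```

In this invented case four fifths of the attainable improvement is already realized at five times the unit size, and the attainable improvement amounts to half of the final efficiency.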
Examination of these results leads to the following conclusions.


1. The efficiency of the experiment is improved by increasing the overall
number of pairs of individuals even at the expense of introducing deviation from
the optimum proportions of the three types of related pairs included in the set
of data.


2. Most of the possible increase in efficiency is gained by a five-fold increase
in the size of the experiment.


3. The proportional gain in efficiency, $g$, is greater for the $MZ_T$, $MZ_A$, $S_T$
set than for the $MZ_T$, $S_A$, $S_T$ set, and when the heritability is high the value of
$I_{20}$ is greater for the first set of data than for the second set.


4. When the level of heritability is constant the proportional gain in efficiency
depends very little upon the magnitude of $\sqrt{H_R/D_R}$.

Whatever value of n is accepted as the limit for the worthwhile increase in 
size of an experiment, it is possible at this limit to derive, from the proportions 
of the various classes of familial relationship in each set of data, some conclu- 
sions about the minimum proportions of the rarer classes which can be tolerated 
before at least some waste of time and effort is involved. If the proportions 
available be smaller than those tolerable, then the overall size of the experiment 
can be reduced confidently by the exclusion of full-siblings raised together, to 
save money without any loss of efficiency.

The discussion of this issue is made less precise by the demonstration that
the optimum proportions of the different types of pair constituting a given
minimal design are considerably influenced by the heritability of the character
concerned. It has already been shown, however, that an approximation to the
optimum proportions can give acceptable efficiency over a wide range of values
for the broad heritability. It was suggested that for an experiment of given
size, proportions of 0.2 ($MZ_T$), 0.3 ($MZ_A$) and 0.5 ($S_T$) give reasonable efficiency,
within the limits imposed by this relatively inefficient set of data. Similarly,
proportions of 0.2 ($MZ_T$), 0.5 ($S_A$) and 0.3 ($S_T$) give acceptable efficiency for all
values of the heritability. The combined proportion of the rarer groups ($MZ_T$ and $MZ_A$)
required for optimal efficiency with the first set is thus 0.5, and for the second
set the combined proportion of $MZ_T$ and $S_A$ is 0.7.

It was shown earlier that when the numbers in the rarer groups are restricted,
most of the gain in efficiency obtainable by increasing the number of full-siblings
reared together is achieved within a five-fold increase in the total size of the
experiment. At this stage the relative proportions of the rarer groups have
fallen to 0.5/5 = 0.1, in the case of the first set of data, and to 0.7/5 = 0.14, for the
second set. The implication of this is clear: if the combined proportion of the
rarer groups of related pairs is below 10-15 per cent it is likely that unnecessary
money and effort is being expended on a large experiment, the size of which
might be substantially reduced without prejudicing the efficiency of the statistical
analysis.
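The arithmetic behind the 10-15 per cent rule can be written out as a short sketch. It assumes, as the argument does, that the number of rare pairs stays fixed while the total size of the experiment grows $n$-fold through added full-siblings reared together; the function name is illustrative only.

```python
def rare_proportion_after_scaling(optimal_rare_proportion, n):
    """Proportion of rare pairs once the total size has grown n-fold
    while the number of rare pairs is held fixed."""
    return optimal_rare_proportion / n


first_set = rare_proportion_after_scaling(0.5, 5)    # rarer groups of the first set
second_set = rare_proportion_after_scaling(0.7, 5)   # rarer groups of the second set
```

The two results, 0.1 and 0.14, are the limits quoted in the text.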
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6. SUMMARY AND RECOMMENDATIONS 


The clarity and value of the biometrical-genetical approach to the study of 
continuous variation in a human population has been briefly surveyed, and a
more detailed study of particular minimal sets of data has been conducted in 
order to provide some criteria for the design of future experiments. 

Jinks & Fulker (1969) suggest that the existing approaches to the psycho- 
genetical analysis of human behaviour be abandoned in favour of the biometrical 
approach. This investigation has studied the implications of the biometrical 
method for experimental design. It is appreciated that the investigation is not 
exhaustive and that different criteria might be adopted and different designs 
tested, but it is hoped that the method of this study will promote greater concern 
for the structure of experiments in human psychogenetics.

It has been shown that the weighted least squares estimates of additive and
dominance genetic variation, combined in the least efficient manner, i.e. by
difference, are more efficient when they are estimated from a set of data based
on pairs of monozygotic twins reared together, full-siblings reared apart and full-siblings
reared together, than when estimates are obtained from a set of data of
comparable size based on measurements taken on monozygotic twins reared
together, monozygotic twins reared apart and full-siblings reared together.

The statistical conclusions are reinforced by economic considerations,
since the design involving foster-siblings requires fewer twins for results of
comparable efficiency than the design which incorporates monozygotic twins
reared apart. The foster-twin design therefore obviates some of the expense
involved in twin ascertainment. An attempt has been made to assess the effect
of limited availability of pairs belonging to the rarer familial relationships. It
has been demonstrated that even when such pairs are relatively infrequent large
experiments are still desirable, providing that the combined number of the rarer
pairs is not less than 10-15 per cent of the total number of pairs included in the
experiment.


ACKNOWLEDGEMENTS 
This work was carried out while the author was supported by an S.R.C. 
research studentship. The work is part of a programme of research in psycho- 
genetics in the Departments of Psychology and Genetics, supported by a grant 
from the Medical Research Council. The author is indebted to Professors
P. L. Broadhurst and J. L. Jinks and Dr M. J. Kearsey for advice and comments.


REFERENCES 


Cattell, R. B. (1960). The multiple abstract variance analysis equations and solutions
for nature-nurture research on continuous variables. Psychol. Rev. 67, 353-372.
Cooke, P., Jones, R. Morley, Mather, K., Bonsall, G. W. & Nelder, J. A. (1962).
Estimating the components of continuous variation, I. Statistical. Heredity 17,
115-133.
Gottesman, I. I. (1966). Genetic variance in adaptive personality traits. J. Child
Psychol. Psychiat. 7, 199-208.
Hayman, B. I. (1960). Maximum likelihood estimation of genetic components of continuous
variation. Biometrics 16, 369-381.
Holzinger, K. J. (1929). The relative effect of nature-nurture influences on twin difference.
J. educ. Psychol. 20, 245-248.
Jinks, J. L. & Fulker, D. W. (1969). A comparison of the biometrical genetical, MAVA,
and classical approaches to the analysis of human behaviour (in press).
Mather, K. (1949). Biometrical Genetics. London: Methuen.
Nelder, J. A. (1960). Estimation of variance components in certain types of experiment
on quantitative genetics. In O. Kempthorne (ed.), Biometrical Genetics. London:
Pergamon Press.
Vandenberg, S. G. (1962). The hereditary abilities study: hereditary components in a
psychological test battery. Am. J. hum. Gen. 14, 220-237.


The British Journal of Mathematical and Statistical Psychology, Vol. 22, Part 2, November 1969


А GENERALIZED COMMON FACTOR ANALYSIS BASED ON 
RESIDUAL COVARIANCE MATRICES OF PRESCRIBED 
STRUCTURE 


By Roderick P. McDonald
University of New England, New South Wales, Australia 


An algorithm is described, the purpose of which is to express a correlation 
matrix as the sum of two matrices, of which one is of rank less than its order, and 
the other, the residual matrix, is of prescribed structure. Applications are 
described to approximate simplex and circumplex matrices, to the factoring of 
groups of variables, and to a form of multi-mode factor analysis. 


1. INTRODUCTION 


An examination of the essential elements and basic assumptions of the
classical linear common factor analysis model serves to suggest a number of
directions in which it is possible to proceed, in seeking more general models
that might be applicable to data that do not meet the requirements of the classical
model itself. One such direction has been explored by McDonald (1962 b,
1965, 1967 a, b, c) under the general title of non-linear factor analysis. This
title is perhaps somewhat misleading, since the more general models so obtained
are essentially applications of linear algebra, and the models are linear in their
coefficients, while permitted to be non-linear in the factors or latent traits.
Another possible direction for generalizations on the classical common factor
model has been indicated more recently by McDonald (1968). Essentially, this
amounts to seeking factors common to two or more groups of variables, and as
such is a generalization both on inter-battery factor analysis as discussed by
Tucker (1958 a), and on the classical model, reducing to a treatment of the
former problem when we have just two groups of variables, and to the latter
when each group contains just one variable. Practical computing procedures
for this development (not reported in McDonald, 1968) have now been developed,
but while this work was in progress, it was realized that yet a further, more
general, treatment was possible, which includes the case of groups of variables
as a specialization of the theory. Essentially, the present theory amounts to this:
a given covariance or correlation matrix is expressed as the sum of two matrices.
The first, as in the classical theory, is a matrix of rank less than its order, expressible
as the product of a matrix of factor loadings and its transpose. The
second is a 'residual' covariance matrix of prescribed form. Whereas in the
classical model the residual covariance matrix is required to be a diagonal matrix,
in the current development it may have non-zero elements in prescribed non-diagonal
locations. The special case of groups of variables may be treated by
requiring that all residual covariances be zero except those in diagonal submatrices
corresponding to within-group covariances. An additive (linear)
model for multi-mode factor analysis may also be obtained by an appropriate
prescription of the locations of zero elements of the residual covariance matrix.
This is in contrast to the multiplicative (non-linear) model for multi-mode factor
analysis proposed by Tucker (1963, 1966). With a further slight modification
of the theory, it can be used in order to yield a common factor analysis of multi-category
data. This further development will be treated in a separate paper.


2. THE GENERAL THEORY
In order to introduce the proposed generalization, it is convenient first to
review the elements of the classical factor analysis model. This may be presented
in several ways, of which three deserve some consideration in the present
context.

In the first development, given an $n \times n$ matrix $R$ of covariances (in particular,
of correlations), in common factor analysis one seeks a diagonal matrix $D^2$,
at least non-negative definite, and in general positive definite, such that $R - D^2$
is of rank $r < n$, whence it is possible to write

\[ R - D^2 = F \Phi F', \tag{1} \]

where $F$, an $n \times r$ matrix of rank $r < n$, is the factor pattern, and $\Phi$, an $r \times r$ positive
definite matrix, is the covariance matrix of factor scores. (In the special case
of the orthogonal model, $\Phi = I$.)


In a second, more fundamental development, the factor model itself (the
specification equation) is written in the form

\[ z = c + e, \tag{2} \]

where

\[ c = F x. \tag{3} \]

The $n \times 1$ vectors $z$, $c$, $e$ are vectors of random variables, each component of
$c$, $e$ being respectively the common part and the unique part of each component
of $z$. The vector $x$ is an $r \times 1$ vector of random variables, the common factor
scores. The (non-random) $n \times r$ matrix $F$ is as previously defined. The
assumptions of the classical model are, firstly, that

\[ \mathcal{E}\{c e'\} = 0, \tag{4} \]

so that by eqn. (3)

\[ \mathcal{E}\{x e'\} = 0, \tag{5} \]

and secondly, that

\[ \mathcal{E}\{e e'\} = D^2 \tag{6} \]

is a diagonal matrix, consistent with the previous definition. Writing

\[ \mathcal{E}\{x x'\} = \Phi \tag{7} \]

yields eqn. (1) from eqns. (2) to (7). Note that the covariance matrix of the
common parts, $\mathcal{E}\{c c'\}$, is of rank $r < n$.

In a third development, the classical common factor model may be derived
as a special case of latent trait theory, based on a weak form of the Principle
of Local Independence (cf. Anderson, 1959; McDonald, 1962 a). In this way,
factor theory can be related to other latent trait models, such as the latent
structure models of Lazarsfeld (see, for example, Lazarsfeld, 1950, 1959;
McDonald, 1967 a). A derivation, on these lines, of the basic equations of
non-linear factor analysis, whence the linear model is easily obtained, is given
by McDonald (1967 b), hence this point need not be developed fully here.
It should suffice to note that a very general model can be based on the Principle
of Local Independence, which yields non-linear and linear factor analysis, the
latent class model, and a number of other latent trait models as special cases.
However, in the generalization of theory to be described in this paper, the
Principle of Local Independence is abandoned.
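The covariance structure implied by the classical model, $R = F \Phi F' + D^2$, can be checked numerically. The sketch below uses NumPy; the loadings and factor covariances are hypothetical values chosen only so that the unique variances come out positive, and they do not come from the paper.

```python
import numpy as np

# Hypothetical 5-variable, 2-factor model (illustrative numbers only).
F = np.array([[0.8, 0.0],
              [0.7, 0.2],
              [0.6, 0.3],
              [0.1, 0.7],
              [0.0, 0.6]])             # n x r factor pattern
Phi = np.array([[1.0, 0.3],
                [0.3, 1.0]])           # r x r covariance matrix of factor scores

common = F @ Phi @ F.T                 # covariance matrix of the common parts
D2 = np.diag(1.0 - np.diag(common))    # unique variances giving a unit diagonal
R = common + D2                        # correlation matrix implied by the model

# Removing D^2 from R leaves a matrix of rank r, as eqn (1) requires.
rank_common = np.linalg.matrix_rank(R - D2)
```

Here `rank_common` equals the number of factors, which is the defining property exploited by the first development.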

The central feature of the generalized factor model to be described is that
the residual covariance matrix,

\[ R_{\mathrm{res}} = R - F \Phi F', \tag{8} \]

is not assumed, as implied by eqn. (1), to be a diagonal matrix. A convenient
device for describing the model makes use of the elementwise product of
matrices. Let

\[ Z = X \ast Y = [z_{jk}] = [x_{jk} y_{jk}], \tag{9} \]

where each of the matrices $X$, $Y$, $Z$ is of order $n \times m$. That is, the $(j, k)$th
element of $Z$ is the product of the $(j, k)$th elements of $X$ and of $Y$. An extremely
simple algebra may be based on this definition of a product. The following
properties and definitions should be noted: (a) elementwise multiplication is
commutative and associative; (b) the unit matrix under elementwise multiplication
has every element equal to unity; (c) the zero matrix under elementwise
multiplication is the null matrix of conventional matrix algebra; (d) if $X \ast Y = 0$,
then either $x_{jk} = 0$, or $y_{jk} = 0$, for every $j$, $k$; (e) define the elementwise complement
$\bar{X}$ of an $n \times m$ matrix $X$, such that

\[ X + \bar{X} = U, \tag{10} \]

where $U$ is the elementwise unit matrix of order $n \times m$. Then $X \ast \bar{X} = 0$ if and
only if $x_{jk} = 0$ or 1 ($j = 1, \ldots, n$; $k = 1, \ldots, m$); (f) define an S-matrix (a selection
matrix) as a matrix such that $S \ast \bar{S} = 0$, or, what is the same thing, such that
each of its elements is either 0 or 1; (g) if $X$ is a square matrix of order $n$, then

\[ X \ast I_n = \mathrm{Diag}\{X\}, \tag{11} \]

where $I_n$ is the $n \times n$ identity matrix of conventional matrix algebra, and

\[ X \ast \bar{I}_n = X - \mathrm{Diag}\{X\}. \tag{12} \]

Note that an application of eqn. (12) to eqn. (1) yields

\[ R \ast \bar{I}_n = F \Phi F' \ast \bar{I}_n, \tag{13} \]

which is to say that each non-diagonal element of $R$ is equal to the corresponding
non-diagonal element of $F \Phi F'$. More generally, the algebra just described is
at least a convenient way to describe equalities between selected elements of
matrices, and may, for example, find application in work on factor patterns
with prescribed zeros, as in simple structure theory. Thus, for a prescribed
S-matrix, $S$, the equation

\[ X \ast S = Y \ast S \tag{14} \]

implies that every element of $X$ whose location corresponds to a unit in $S$ is
equal to the corresponding element of $Y$.
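The elementwise algebra maps directly onto array multiplication in NumPy, where `*` between two arrays of the same shape is the elementwise product. The following sketch illustrates properties (e) to (g) on an arbitrary matrix; the numbers and the helper name `is_s_matrix` are illustrative assumptions, not part of the original text.

```python
import numpy as np

X = np.array([[4.0, 1.0, 2.0],
              [1.0, 5.0, 3.0],
              [2.0, 3.0, 6.0]])      # arbitrary square matrix
n = X.shape[0]

U = np.ones((n, n))                  # elementwise unit matrix
I = np.eye(n)                        # conventional identity matrix (an S-matrix)
I_bar = U - I                        # elementwise complement of I, eqn (10)

diag_part = X * I                    # eqn (11): Diag{X}
offdiag_part = X * I_bar             # eqn (12): X - Diag{X}


def is_s_matrix(S):
    """True when S * S_bar = 0, i.e. every element of S is 0 or 1."""
    S = np.asarray(S, dtype=float)
    return bool(np.all(S * (np.ones_like(S) - S) == 0))
```

A quick use of the helper: `is_s_matrix(I)` holds, whereas `is_s_matrix(X)` does not because `X` has elements other than 0 and 1.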

On the lines of the first development, above, of classical factor analysis, it is
now a simple matter to describe the proposed generalization. Let $R$ be a given
$n \times n$ matrix of covariances (in particular, correlations). Let $M$ be a prescribed,
symmetric $n \times n$ matrix (i.e. its elements are fixed and known), to be referred to
as the model matrix. Let $E$ be an initially unknown $n \times n$ matrix. It is desired
to determine $E$ such that $E \ast M$ is positive definite, or at least non-negative
definite, and that

\[ R - E \ast M = F \Phi F' \tag{15} \]

is of rank $r < n$. Note that

\[ R \ast \bar{M} = F \Phi F' \ast \bar{M} + E \ast M \ast \bar{M}. \tag{16} \]

If $M$ is an S-matrix, so that $M \ast \bar{M} = S \ast \bar{S} = 0$, it further follows that

\[ R \ast \bar{S} = F \Phi F' \ast \bar{S}. \tag{17} \]

The special case of the classical factor analysis model is obtained by setting
$S = I$. In this case, eqn. (15) reduces to eqn. (1), and eqn. (17) states that
each non-diagonal element of $R$ is equal to the corresponding element of $F \Phi F'$.
More generally, here eqn. (17) states that each element of $R - F \Phi F'$ in a
location prescribed by correspondence to that of a unit in $\bar{S}$, i.e. a zero in $S$,
should be equal to zero.
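Eqn. (17) can be verified on a small constructed case. In the sketch below a single factor with equal loadings and a tridiagonal S-matrix are assumed purely for illustration; the residual values in `E` are likewise hypothetical.

```python
import numpy as np

n = 5
j = np.arange(n)
S = (np.abs(j[:, None] - j[None, :]) < 2).astype(float)  # tridiagonal S-matrix
S_bar = 1.0 - S                                           # elementwise complement

f = np.full((n, 1), 0.6)                      # hypothetical loadings, Phi = I
E = np.full((n, n), 0.1) + 0.54 * np.eye(n)   # hypothetical residual values
R = f @ f.T + E * S                           # eqn (15) rearranged

lhs = R * S_bar                               # left side of eqn (17)
rhs = (f @ f.T) * S_bar                       # right side of eqn (17)
```

Only the elements of `E` selected by `S` enter `R`, so `lhs` and `rhs` agree whatever values `E` takes elsewhere.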

In principle, a number of classical factoring algorithms would yield counterparts
for this more general problem, by simple analogy. Two methods have
been tried, one a generalization on canonical factor analysis (CFA), the other a
generalization on principal factor analysis (PFA) (Rao, 1955). In a number of
cases, the former of these failed to yield satisfactory results. These failures are
still being investigated. On the other hand, the generalized PFA procedure has
given plausible results in most applications made so far, both to constructed data
and to empirical data (an exception is noted below). For the present purpose,
therefore, it will suffice to describe the PFA type of procedure. Essentially,
it is an iterative algorithm of the type referred to by Harman (1960, p. 89) as
'iteration by refactoring'.

For a prescribed model matrix, $M$, suppose that the number of factors, $r$, is
known, and that $E$ in eqn. (15) is approximated by some initial guessed matrix
value $E_0$. Obtain the $n$ eigenvalues $\lambda_1, \ldots, \lambda_r, \lambda_{r+1}, \ldots, \lambda_n$ of $R - E_0 \ast M$, arranged
in descending order of magnitude, and let $q_1, \ldots, q_r, q_{r+1}, \ldots, q_n$ be the corresponding
eigenvectors. Let $\Lambda_r = \mathrm{Diag}\{\lambda_1, \ldots, \lambda_r\}$, $\Lambda_m = \mathrm{Diag}\{\lambda_{r+1}, \ldots, \lambda_n\}$,
$[Q_r : Q_m] = [q_1 \ldots q_r : q_{r+1} \ldots q_n]$, so that one may write

\[ R - E_0 \ast M = [Q_r : Q_m] \begin{bmatrix} \Lambda_r & 0 \\ 0 & \Lambda_m \end{bmatrix} \begin{bmatrix} Q_r' \\ Q_m' \end{bmatrix}. \tag{18} \]

Then choose, as the next approximations to $F$ and to $E \ast M$,

\[ F_1 = Q_r \Lambda_r^{1/2}, \tag{19} \]

\[ E_1 \ast M = [R - Q_r \Lambda_r Q_r'] \ast M, \tag{20} \]

return to eqn. (18), replacing $E_0$ by $E_1$, and so proceed, obtaining a sequence of
matrix values $E_0, E_1, E_2, \ldots$, which, it is hoped, will converge on the desired
matrix value $E$.
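The iteration of eqns. (18) to (20) can be sketched with NumPy as follows. This is not the GENPAX program: the starting value $E_0 \ast M = 0$, the clipping of negative retained eigenvalues, and the function name are assumptions made for the illustration. The example data are the constructed correlation matrix of Table 1 with a tridiagonal S-matrix and one factor.

```python
import numpy as np


def iterate_by_refactoring(R, M, r, n_iter=10):
    """Sketch of eqns (18)-(20), starting from the assumed value E_0 * M = 0."""
    EM = np.zeros_like(R)                         # current value of E * M
    criteria = []
    for _ in range(n_iter + 1):
        vals, vecs = np.linalg.eigh(R - EM)       # symmetric eigenproblem (ascending)
        order = np.argsort(vals)[::-1]            # rearrange in descending order
        vals, vecs = vals[order], vecs[:, order]
        # Criterion: mean square of the n - r rejected eigenvalues.
        criteria.append(float(np.mean(vals[r:] ** 2)))
        F = vecs[:, :r] * np.sqrt(np.clip(vals[:r], 0.0, None))   # eqn (19)
        EM = (R - F @ F.T) * M                    # eqn (20)
    F = F * np.where(F.sum(axis=0) < 0, -1.0, 1.0)  # fix arbitrary eigenvector signs
    return F, EM, criteria


idx = np.arange(6)
dist = np.abs(idx[:, None] - idx[None, :])
R = np.where(dist == 0, 1.0, np.where(dist == 1, 0.5, 0.25))   # Table 1
S = (dist < 2).astype(float)                                    # S-matrix with l = 2
F, ES, criteria = iterate_by_refactoring(R, S, r=1, n_iter=10)
```

Under these assumptions the first entry of `criteria` is computed from the eigenvalues of `R` itself, and the loadings in `F` settle near 0.5, which is the value used to construct the example.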

The usual problems of classical factor analysis (see, for example, Anderson &
Rubin, 1956) of identifiability of the structure, estimation of the model parameters,
and testing a hypothesized value of $r$, the number of common factors,
all have their more general counterparts here. The problems of estimation and
hypothesis testing are still under investigation. However, it will be seen below
that analogue estimates yield plausible results.

In the classical model, it is held, on the basis of a simple count of the
number of unknowns, that the model can be fitted non-trivially to real data,
provided that

\[ r < \tfrac{1}{2}\left[2n + 1 - (8n + 1)^{1/2}\right]. \tag{21} \]

(See, for example, Anderson & Rubin, 1956.)

It is expected that in most applications of the theory given here, the model
matrix $M$ will be an S-matrix. In such applications, on the same basis, it would
follow that this model can be fitted non-trivially to real data, provided that

\[ r < \tfrac{1}{2}\left\{2n + 1 - [8(n + p) + 1]^{1/2}\right\}, \tag{22} \]

where $p$ is the number of unit elements in the lower triangle of the (symmetric)
model matrix $S$ (omitting the diagonal). Further, in such cases, a testable
implication of the model is that residual covariances whose locations correspond
to those of zeros in $S$ shall be equal to zero.
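The two counting bounds are easy to evaluate. The sketch below codes eqns. (21) and (22) directly; the function names are illustrative. As a worked case, six variables with a tridiagonal S-matrix have $p = 5$ unit elements below the diagonal.

```python
import math


def classical_bound(n):
    """Upper bound on r from eqn (21)."""
    return 0.5 * (2 * n + 1 - math.sqrt(8 * n + 1))


def generalized_bound(n, p):
    """Upper bound on r from eqn (22); p counts the unit elements of S
    below the diagonal."""
    return 0.5 * (2 * n + 1 - math.sqrt(8 * (n + p) + 1))


classical_six = classical_bound(6)           # r must be less than this
tridiagonal_six = generalized_bound(6, 5)    # six variables, l = 2
```

For six variables the classical bound is 3, so at most two factors can be fitted non-trivially, while the tridiagonal case gives a bound a little below 1.8, leaving room for a single major factor.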

In applications in which the model matrix is not an S-matrix, no useful
results have yet been established on the problem of identifiability. Also, it
might at first sight appear that the model has no testable implications. However,
it is still possible, for such a model matrix, to seek $E \ast M$ such that the smallest
$n - r$ eigenvalues of $R - E \ast M$ are zero (or, in actual practice, very small).
Applications of this kind are still under investigation (e.g. prescribing a simplex
structure for $E \ast M$), and will not be discussed here. A CDC 3200/3600 computer
program, GENPAX (GENeralized Principal AXis factor analysis), has been
developed, which accepts as input a correlation or raw score matrix and the
prescribed model matrix $M$. The number of 'common factors' $r$ is either
prescribed, or determined by the program to correspond to the number of
eigenvalues of the correlation matrix that are greater than a prescribed value.
In the following section, the major applications of the theory that have been
considered and tried thus far are described and illustrated.

Finally, for this section, it will be noted that the second derivation of the
classical model, given above, yields, as the appropriate modification, the replacement
of the assumption (6) by the assumption

\[ \mathcal{E}\{e e'\} = E \ast M. \tag{23} \]

It is no longer appropriate to refer to the components of $e$ as unique factors.
It is suggested, as by McDonald (1968) for the case of groups of variables, that the
components of $c$, of $x$, and of $e$, in this context, be referred to, respectively, as the
major components, the major factors and the minor components. In some applications
it may be desirable to fit a further factor model (classical or generalized) to
the covariance matrix of the minor components, as given by eqn. (23).


3. SOME APPLICATIONS OF THE THEORY 
3.1. The Linear Serial Correlation Case 


Suppose that a set of $n$ variables has a known natural order, $z_1, \ldots, z_n$.
Suppose further that, in addition to the covariance between distinct variables
that can be accounted for by $r < n$ major factors, there are serial correlation
effects between neighbouring variables $z_j$, $z_{j+1}$ ($1 \le j$; $j + 1 \le n$). More generally,
such effects may extend over three or more successive variables in the sequence.
For example, in the factor analysis of time series data, such as in learning studies
where the variables correspond to successive learning trials (Tucker, 1958 b),
it is reasonable to postulate that errors on successive occasions and perhaps
successive-but-one, and so on, will be serially correlated. Allowance can be
made for such serial effects by prescribing the extent of the effects (over two,
three, ..., successive variables) and accordingly analysing the data by the procedure
of Section 2.


If it is assumed that serial correlation effects extend over neighbouring variables
only, the S-matrix is a tridiagonal matrix, with unit elements in the diagonal,
superdiagonal, and subdiagonal, and zero elements elsewhere. For such
effects, of greater extent, a convenient description of the S-matrix is given by an
extension of the Kronecker delta notation. Let

\[ \delta^{(l)}_{jk} = 1 \quad (|j - k| < l), \qquad \delta^{(l)}_{jk} = 0 \quad (|j - k| \ge l). \tag{24} \]

Then, in this case, choose

\[ S = [s_{jk}] = [\delta^{(l)}_{jk}]. \tag{25} \]

If the integer $l$ is set equal to unity, it is assumed there are no serial correlation
effects. If $l = 2$, $S$ is tridiagonal, and serial correlation between $z_j$, $z_{j+1}$ ($1 \le j$;
$j + 1 \le n$) is expected. If $l = 3$, serial correlation between $z_j$, $z_{j+1}$ and $z_{j+2}$
($1 \le j$; $j + 2 \le n$) is expected, and so on.

A simple constructed example, designed to test the computer program,
will serve as the first illustration of this case. The correlation matrix is given in
Table 1. The structure of the matrix is obvious. The S-matrix, in Table 2,
corresponds to $l = 2$, i.e. it allows serial correlation between successive variables.
One factor was prescribed. Table 3 shows the behaviour of the eigenvalues of
$R - E \ast S$ (cf. eqn. (18)) through ten successive iterations, with the eigenvalues of
$R$ in the first row. The criterion is the mean square of the rejected eigenvalues.
Table 4 gives the major factor pattern and the residual correlations obtained after
ten iterations.
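The banded S-matrix for the linear serial correlation case, with unit elements wherever $|j - k| < l$, can be generated in one line with NumPy. The function name below is an illustrative choice.

```python
import numpy as np


def band_s_matrix(n, l):
    """S-matrix with s_jk = 1 when |j - k| < l and 0 otherwise."""
    j = np.arange(n)
    return (np.abs(j[:, None] - j[None, :]) < l).astype(int)


S_none = band_s_matrix(6, 1)    # l = 1: no serial effects, S is the identity
S_tri = band_s_matrix(6, 2)     # l = 2: tridiagonal, as prescribed in Table 2
p = int(np.tril(S_tri, k=-1).sum())   # unit elements below the diagonal
```

For six variables and $l = 2$ this gives `p = 5`, the count that enters the bound on the number of factors.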


TABLE 1. CONSTRUCTED CORRELATION MATRIX 


1.00  0.50  0.25  0.25  0.25  0.25
      1.00  0.50  0.25  0.25  0.25
            1.00  0.50  0.25  0.25
                  1.00  0.50  0.25
                        1.00  0.50
                              1.00


TABLE 2. PRESCRIBED S-MATRIX 


1  1  0  0  0  0
1  1  1  0  0  0
0  1  1  1  0  0
0  0  1  1  1  0
0  0  0  1  1  1
0  0  0  0  1  1

TABLE 3. BEHAVIOUR OF EIGENVALUES DURING ITERATION 


Iteration   Eigenvalues                                 Criterion
    0       2.67   1.06   0.88   0.64   0.44   0.30     0.52014
    1       2.06   0.44   0.30   0.08  -0.04  -0.18     0.06470
    2       1.79   0.22   0.16   0.00  -0.01  -0.10     0.01668
    3       1.66   0.12   0.10   0.00  -0.03  -0.06     0.00586
    4       1.60   0.08   0.07   0.00  -0.03  -0.04     0.00263
    5       1.56   0.06   0.04   0.00  -0.02  -0.04     0.00137
    6       1.54   0.04   0.03   0.00  -0.02  -0.03     0.00075
    7       1.53   0.03   0.02   0.00  -0.01  -0.02     0.00042
    8       1.52   0.02   0.01   0.00  -0.01  -0.02     0.00023
    9       1.51   0.02   0.01   0.00  -0.01  -0.01     0.00013
   10       1.51   0.01   0.01   0.00  -0.00  -0.01     0.00007


It will be noted that the model considered here might yield an alternative
explanation of empirical correlation matrices that bear at least a superficial
resemblance to a simplex (Guttman, 1954), since the most obvious observable
characteristic of a simplex matrix is that correlations in the subdiagonal and
superdiagonal are 'high' while correlations further from the diagonal are 'low'.

A second example of this special case consists in an application of it to a matrix
drawn by Guttman (1957), from data obtained by Summerfield & Lubin, to
illustrate a simplex. The correlation matrix is given in Table 5. The S-matrix,
not shown, corresponds to that of Table 2 (with the last row and column deleted).
Again, one factor was prescribed. The eigenvalues of $R - E \ast S$, the factor
loadings, and the residual covariance matrix, after ten iterations, are shown in
Table 6. The form of the residual covariance matrix indicates two 'serial
links', between variables 1 and 2, and variables 4 and 5, but not between 2 and 3,
or 3 and 4.


TABLE 4

Factor
loading    Residual matrix
0.4947     0.755   0.253  -0.003  -0.003   0.003   0.005
0.4988             0.751   0.245  -0.005   0.001   0.003
0.5108                     0.739   0.239  -0.005  -0.003
0.5108                             0.739   0.245  -0.003
0.4988                                     0.751   0.253
0.4947                                             0.755


TABLE 5. SUMMERFIELD & LUBIN CORRELATION MATRIX 


1.00  0.46  0.42  0.37  0.23
      1.00  0.56  0.49  0.41
            1.00  0.63  0.55
                  1.00  0.61
                        1.00


TABLE 6. SUMMERFIELD & LUBIN DATA

              Factor
Eigenvalues   loadings   Residual covariance matrix
   2.409      0.4524     0.795   0.165   0.009   0.021  -0.041
   0.052      0.6513             0.576  -0.031  -0.013   0.019
   0.003      0.9079                     0.176  -0.070   0.005
  -0.005      0.7716                             0.405   0.147
  -0.054      0.6000                                     0.640
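The residual entries of Table 6 can be cross-checked against Table 5, since with one major factor each residual is the correlation minus the product of the two loadings. The sketch below uses only the printed correlations and loadings; the comparison is limited by their rounding.

```python
import numpy as np

# Summerfield & Lubin correlations (Table 5), written out as a full symmetric matrix.
R = np.array([[1.00, 0.46, 0.42, 0.37, 0.23],
              [0.46, 1.00, 0.56, 0.49, 0.41],
              [0.42, 0.56, 1.00, 0.63, 0.55],
              [0.37, 0.49, 0.63, 1.00, 0.61],
              [0.23, 0.41, 0.55, 0.61, 1.00]])
f = np.array([0.4524, 0.6513, 0.9079, 0.7716, 0.6000])   # loadings from Table 6

residual = R - np.outer(f, f)     # R - F F' for a single factor
serial_12 = residual[0, 1]        # 'serial link' between variables 1 and 2
serial_45 = residual[3, 4]        # 'serial link' between variables 4 and 5
```

The two links stand out clearly from the remaining off-diagonal residuals, which are all much closer to zero.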


3.2. The Circular Serial Correlation Case


As in the previous case, it is supposed that the variables have a known
natural order, and that serial correlation effects extend over two or more
successive variables. However, in contrast to the previous case, it is here
assumed that the order is 'circular', with the last variable regarded as a neighbour
to the first, as in the circumplex of Guttman (1954, 1957). This requires
an S-matrix that is an obvious modification of the S-matrix in the previous case.
It is awkward to describe formally, but the general structure of it will be clear
from the following example.

Table 7 gives a correlation matrix drawn from work by Goodman, and
used by Guttman (1957) as an example of a circumplex. The S-matrix
shown in Table 8 was prescribed, with one major factor. The eigenvalues of
$R - E \ast S$, the factor pattern and the residual covariance matrix, after ten
iterations, are shown in Table 9. Admittedly, the fit is not impressive (note, in
particular, the residual for variables 1 and 4), but the example serves to illustrate
the procedure.

TABLE 7. GOODMAN CORRELATION MATRIX 


1.00  0.17  0.12  -0.03  0.20  0.31
      1.00  0.48   0.22  0.18  0.20
            1.00   0.32  0.17  0.21
                   1.00  0.43  0.34
                         1.00  0.61
                               1.00


TABLE 8. PRESCRIBED S-MATRIX 


1 1 0 0 0 1 
1 1 1 0 0 0 
0 1 1 1 0 0 
0 0 1 1 1 0 
0 0 0 1 1 1 
1 0 0 0 1 1
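The circular S-matrix, awkward to describe formally, has a compact computational description: measure the distance between positions around the circle rather than along the line. The sketch below is one way to generate it with NumPy; the function name is an illustrative assumption.

```python
import numpy as np


def circular_band_s_matrix(n, l):
    """S-matrix with unit elements where the circular distance between
    positions j and k is less than l."""
    j = np.arange(n)
    d = np.abs(j[:, None] - j[None, :])     # distance along the line
    circular = np.minimum(d, n - d)          # distance around the circle
    return (circular < l).astype(int)


S_circ = circular_band_s_matrix(6, 2)        # the pattern prescribed in Table 8
```

Compared with the linear case, the only change is the pair of unit elements in the corners, linking the last variable to the first.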


TABLE 9. GOODMAN DATA

              Factor
Eigenvalues   loadings   Residual covariance matrix
   1.371      0.193      0.963   0.108   0.062  -0.128   0.071   0.181
   0.167      0.319              0.898   0.385   0.058  -0.032  -0.012
   0.040      0.299                      0.910   0.168  -0.029   0.011
   0.008      0.507                              0.743   0.092   0.003
  -0.035      0.667                                      0.555   0.167
  -0.176      0.664                                              0.558


3.3. The General Inter-group Case

Consider the case of $n$ variables, classified on some experimental basis into
$p$ distinct groups, containing in turn $n_1, n_2, \ldots, n_p$ variables, so that

\[ \sum_{g=1}^{p} n_g = n. \]

It may be desirable to account for the correlations between variables drawn
from distinct groups with a minimum number of (major) inter-group factors.
After all covariance between pairs of variables drawn from distinct groups has
been 'explained' by the inter-group factors, there may remain covariance
between distinct variables within any one group, that can be 'explained' by
further (minor) intra-group factors. The case where there are just two groups
of variables was first treated by Tucker (1958 a), and has been further considered
by Gibson (1960) and Kristof (1967). The more general case in which one is
given more than two groups is treated in principle by McDonald (1968). A
practical procedure for this case follows very simply from the present theoretical
development. Let the $(j, k)$th element of the S-matrix be unity if variables
$j$, $k$ belong to the same group, and zero otherwise. If the $p$ groups of variables
are arranged in order, the S-matrix consists of elementwise unit submatrices,
of orders $n_1 \times n_1, \ldots, n_p \times n_p$, as diagonal submatrices, and null submatrices
elsewhere. It is not, however, necessary to order the variables in this way.
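The S-matrix for the inter-group case depends only on group membership, so it can be built from a vector of group labels in any order. The sketch below uses NumPy; the function name and the label coding are illustrative assumptions.

```python
import numpy as np


def group_s_matrix(labels):
    """S-matrix with s_jk = 1 when variables j and k carry the same group label."""
    labels = np.asarray(labels)
    return (labels[:, None] == labels[None, :]).astype(int)


# Groups arranged in order: elementwise unit blocks on the diagonal.
S_ordered = group_s_matrix([0, 0, 1, 1, 1])

# Five tests each measured twice, with variables 6-10 repeating variables 1-5:
# each group is a test and contains its two occasions.
S_repeated = group_s_matrix([0, 1, 2, 3, 4, 0, 1, 2, 3, 4])
```

The second call shows that the variables need not be sorted by group: the unit elements simply fall wherever two variables share a label.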


Data from Meyer & Bendig, used by Harris (1962) in a discussion of
problems of change, will serve to illustrate this case. The correlation matrix
is given in Table 10. Variables 6, 7, 8, 9, 10 are, respectively, repeated measures
(at grades 8 and 11) on the cognitive variables 1, 2, 3, 4, 5 (Primary Mental
Abilities, V, S, R, N and W). The S-matrix, accordingly, is that shown in
Table 11. The programme was instructed to retain factors corresponding to
eigenvalues of the correlation matrix that are greater than unity. This gave
three major factors. The eigenvalues, the factor pattern and the residual
covariance matrix after 20 iterations are shown in Table 12. It should be noted
that this procedure can be extended, without any new considerations, to treat
two or more superordinate and subordinate groupings of variables.


TABLE 10. MzvER-BENDIG CORRELATION 


1000 03% 0420 0527 0385 0813 0351 0422 0405 024 
1000 0331 0139 0404 0347 0655 0420 0137 0153 

1000 0380 0204 0489 0203 0.748 0395 017! 

1000 0237 0-584 —0.036 0456 0:732 0150 

1000 0-319 0414 0265 0188 0431 

1-000 0-342 0457 0:555 0038 

1-000 0477 0062 015 

1-000 0:542 0197 


MATRIX 


: 0:162 

too 046, 
TABLE 11. MEYER & BENDIG S-MATRIX

1  0  0  0  0  1  0  0  0  0
0  1  0  0  0  0  1  0  0  0
0  0  1  0  0  0  0  1  0  0
0  0  0  1  0  0  0  0  1  0
0  0  0  0  1  0  0  0  0  1
1  0  0  0  0  1  0  0  0  0
0  1  0  0  0  0  1  0  0  0
0  0  1  0  0  0  0  1  0  0
0  0  0  1  0  0  0  0  1  0
0  0  0  0  1  0  0  0  0  1
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3.4. A Linear Multi-mode Case 

The illustration of the previous case can be considered to yield an extremely
restricted form of multi-mode factor analysis (cf. Tucker, 1963, 1966). Suppose,
for example, that the raw data consist, as in the example just given, of $n$ repeated
measurements on $p$ tests for a large number of subjects. Then an analysis, as
just described, may yield factors common to the $p$ distinct tests, treated as $p$
groups, each containing $n$ variables. Each of the $p$ residual covariance matrices
may then be analysed, for $n$ sufficiently large, according to the classical common
factor model, to yield factors common to the $n$ repeated measures, on each test in
turn. This treatment is, of course, not symmetrical in tests and occasions of
measurement. One might equally consider the converse procedure, first obtaining
factors common to the $n$ occasions of measurement, treated as $n$ groups,
each containing $p$ variables, and then obtaining factors common to the $p$ tests on
each occasion.

A less restrictive procedure, which is symmetrical in the modes of classification,
is to choose a selection matrix as follows. Let the $(j, k)$th element of the
S-matrix be unity if variables $j$, $k$ are measures on the same test, or on the same
occasion, and zero otherwise. If the variables are ordered either in terms of
tests, or in terms of occasions, with a fixed order of the occasions or tests within
each group, the diagonal submatrices of the S-matrix will be elementwise unit
submatrices, and the off-diagonal submatrices will be identity submatrices (see
example). The analysis using such a selection matrix yields factors common to
occasions of measurement. The residual covariance


Raw data from a study by Mitchell & Gregson (1968) were kindly supplied
by the authors. This [...] smell (s), taste (t) and irritance (i), for three
alcohols, viz. [...] (E), and propanol (P), [...] the data obtained under
condition one only [...] factors were prescribed, yielding after ten iterations
[...] eigenvalues given in Table 15. [...] prescribed, were then performed [...]
elements from the residual covariance matrix after removal of the two major
factors. These are the (s, t, i) matrices for each of M, E and P, and the (M, E, P)
matrices for each of s, t and i. The additional factors so obtained are also shown
in Table 15. The residual covariance matrix, after removal of the two major
factors and these six additional factors, is given in Table 16. Note the two
TABLE 13. MITCHELL & GREGSON CORRELATION MATRIX
-— $ “ x 
— Mt 1000 dfe 0-552 0699 0:581 0:353 0:784 0:779 
Mi 000 0682 0:583 0:546 0:508 0-560 — 0:734 
Es 1-000 0:597 0:337 0:729 0:351 0393 
Et 1:000 0-448 0292 0455 0470 
* Ei 1:000 0-223 0:560 0532 
Р; 1-000 0027 0-116 
Pt 1:000 0:897 
E Pi 1-000 
ndi 
TABLE 14. PRESCRIBED S-MATRIX
1 1 1 1 0 0 1 0 0
1 1 1 0 1 0 0 1 0
1 1 1 0 0 1 0 0 1
1 0 0 1 1 1 1 0 0
0 1 0 1 1 1 0 1 0
0 0 1 1 1 1 0 0 1
1 0 0 1 0 0 1 1 1
0 1 0 0 1 0 1 1 1
0 0 1 0 0 1 1 1 1
TABLE 15. EIGENVALUES AND FACTOR PATTERN 
Eigenvalues Factor pattern 
491 Ms 0879 -0275 0069 0-151 
166 Mt 0861 0016 0463 0-078 
x 008 М: 0:887 0465 -0216 
004 Es 0652 —0-041 0-082 0:752 
001 Et 0567 —0:306 0:592 0:761 
000 к 0:603 0:661 0:122 
—0:01 Ps 0678 —0:568 0:354 — 0:005 
—004 Pt 0:720 — 0:504 0:369 — 0:023 
-009 Pi 0722 0505 0-451 
TABLE 16. RESIDUALS
0-124 9039 0:085 0001 -0002 0005 0033 0:008 
0.038 0041 0022 0003 —0:022 —0:005 0-124 
—0131 0038 -0:178 —0001 0014 —0011 
0-001 0017 -0%064 —0007 —0-020 
—0:345 0:012 0:002 —0013 
0-029 —0-006 0-015 
0092 -0-008 
0-091 
0.496
0.649
0.917
0.395
0.280
0.770
0.366
0.434
1.000
negative elements in the diagonal of this matrix. It is clearly absurd, of course,
to account for a 9 x 9 correlation matrix with a total of eight factors, hence the
defects in the results are hardly surprising.
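
The rule for the selection matrix (unity when two variables share a test or an
occasion, zero otherwise) can be sketched in a few lines of Python. This is only
an illustration: the function name `multimode_selection_matrix` and its argument
names are assumptions introduced here, and the variables are taken to be ordered
by group with a fixed order within each group, as the text requires.

```python
def multimode_selection_matrix(n_groups, n_within):
    """Build the 0/1 selection matrix for n_groups * n_within variables.

    Variables are assumed to be ordered group by group, with the same
    order of the second mode inside every group.
    """
    size = n_groups * n_within
    S = [[0] * size for _ in range(size)]
    for j in range(size):
        for k in range(size):
            same_group = (j // n_within) == (k // n_within)    # same first-mode level
            same_position = (j % n_within) == (k % n_within)   # same second-mode level
            S[j][k] = 1 if (same_group or same_position) else 0
    return S


# Three alcohols by three sensory measures, as in the 9-variable example.
S = multimode_selection_matrix(3, 3)
for row in S:
    print(" ".join(str(v) for v in row))
```

With three groups of three variables the diagonal blocks come out as unit
submatrices and the off-diagonal blocks as identity submatrices, which is the
pattern prescribed in Table 14.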


4. Discussion 


A good deal of further work is needed on the theory described above. The
major problems requiring further investigation have already been indicated,
namely the problems of identifiability, estimation and hypothesis-testing. The
generalized canonical factor analysis procedure, which was also tried, seemed
well behaved in the case of groups of variables, but not in the other cases. The
difficulty here seems to be related to a failure of the matrices E * S to remain
positive definite, in the course of the (canonical) iterative sequence. It is known
(Bellman, 1960, p. 94) that the elementwise product of positive definite matrices
is also positive definite, but at least in some of the cases of interest, the desired
selection matrix is not positive definite. Further investigation of the character
of such elementwise products is therefore desirable.
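
The point about elementwise products can be illustrated numerically. The sketch
below is not taken from the paper: the matrices `E`, `S_band` and `S_identity`
are hypothetical, and positive definiteness is checked by attempting a Cholesky
factorization, which succeeds only for positive definite matrices.

```python
import math


def hadamard(A, B):
    """Elementwise (Hadamard) product of two equally sized matrices."""
    return [[a * b for a, b in zip(row_a, row_b)] for row_a, row_b in zip(A, B)]


def is_positive_definite(A, tol=1e-12):
    """Return True if a Cholesky factorization of the symmetric matrix A exists."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                if s <= tol:          # non-positive pivot: not positive definite
                    return False
                L[i][j] = math.sqrt(s)
            else:
                L[i][j] = s / L[j][j]
    return True


# Hypothetical positive definite covariance matrix.
E = [[1.0, 0.9, 0.9],
     [0.9, 1.0, 0.9],
     [0.9, 0.9, 1.0]]

# A 0/1 selection pattern that is not positive definite ...
S_band = [[1, 1, 0],
          [1, 1, 1],
          [0, 1, 1]]
# ... and one that is (the identity pattern).
S_identity = [[1, 0, 0],
              [0, 1, 0],
              [0, 0, 1]]

print(is_positive_definite(E))                        # E itself
print(is_positive_definite(hadamard(E, S_band)))      # product with indefinite S
print(is_positive_definite(hadamard(E, S_identity)))  # product with definite S
```

In this toy case the product with the banded pattern loses positive
definiteness while the product with the identity pattern keeps it, which is the
kind of behaviour the paragraph above describes.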

On the other hand 


usable results can already be obtained by the above procedures. 


, rather than the more refined statistical pro- 
The present theory can 
THE COMMON FACTOR ANALYSIS OF MULTICATEGORY DATA


By RODERICK P. MCDONALD
University of New England, New South Wales, Australia


An algorithm is described that serves to fit a linear latent structure model
to multicategory data. The model has the formal properties of a (generalized)
common factor model, in contrast to principal component analysis. An empirical
example is given.
1. INTRODUCTION 

The question of applicability of factor analytic techniques to what has
variously been called qualitative data, category data, multichotomous (including
dichotomous) data, discrete data, or, simply, items, has been bedevilled by a number
of controversies. Writers such as Guttman (1953) and Lazarsfeld (1950) have
tended to emphasize the distinction between such data and quantitative or
continuous data, and to argue in principle that techniques such as factor analysis
are essentially appropriate to continuous variables, and hence inapplicable to
qualitative data. Burt (1950), on the other hand, has argued that from the early
work of Yule onward it has proved advantageous to treat both kinds of information
in the same terms, wherever possible.
In the special case of dichotomous data, discussions of the applicability of
factor analysis were for the most part concerned with what seemed to be the
problem of choosing a ‘proper’ measure of association between such variables,
particular reference being made to the matter of difficulty factors (Ferguson, 1941;
Wherry & Gaylord, 1944; Carroll, 1945; Gourlay, 1951; Dingman, 1958).
A resolution of both these problems is at least implicit in the work of Lazarsfeld
(1950, 1959) on latent structure analysis, though the relevant conclusions that
may be drawn from this work are somewhat obscured by Lazarsfeld’s use of a
terminology that emphasizes differences between dichotomous items and
‘quantitative’ variables. In consequence, such identities as emerge between
aspects of latent structure analysis and of factor analysis are regarded as having at
most a purely formal and, one might say, an accidental character. There is
certainly a formal identity between the model equations of the Spearman case in
factor analysis and Lazarsfeld’s latent linear model for dichotomous items, and,
more generally, a formal identity between the model equations of multiple factor
analysis, and a correspondingly general latent structure model (see McDonald,
1967a). Essential differences arise at the level of sampling theory, where
assumptions with respect to the multivariate distribution function of dichotomous
items must necessarily differ from those made for multivalued variables.

Present address: Ontario Institute for Studies in Education, Toronto 5, Ontario, Canada.
It would now seem well established (McDonald, 1967a) that matrices of
either phi-coefficients, or of the covariances of dichotomous items, scored zero and
unity, may be factored in the usual sense of multiple common factor analysis, and
the results interpreted in the usual way, provided that the assumption of linearity
of regression is met. That is, the conditional probability of each specified
dichotomous character must be a linear function of the (one or more) factors or
latent traits. The model parameters obtained by the factoring of covariances can
be identified with the model parameters of Lazarsfeld's linear latent structure
models. In practice, the assumption of linearity may not be met, and the analysis
will then yield ‘difficulty’ factors, which can be identified with orthogonal
components of a curvilinear regression of the dichotomous items on the factors
(Burt, 1953; McDonald, 1967a). An


by the perfect. scale (Walker, 1931; Guttman, 1950), as has been indicated by 
McDonald (1967a). Procedures fo 


а, Which at least appear to 
Guttman was Primarily concerned with the 
ems, hence with what is essentially a non-linear 
implicitly assumed a linear 
ntrasts іп the bipolar components qua substan- 
ation, hence, implicitly, in terms of more than 
both writers were concerned with what is now 
as component analysis, in contrast 


[...] paper is to outline a procedure [...]
multicategory data. The distinction in the case of [...]
factor analysis and component analysis, will be made [...]


by McDonald (1969; see also McDonald [...]
placed upon certain row-sums in the factor
pattern. The theory suggests an
algorithm for solving this problem. An empirical illustration is given in Section 3.
Consider n attributes, the jth of which has $r_j$ ($j = 1, \ldots, n$) mutually
exclusive and exhaustive categories. Let $t = \sum_{j=1}^{n} r_j$,
the total number of categories. Let $p_{jl}$ denote the probability that an experimental
unit falls into category l on attribute j, and let $p_{jl.km}$ denote the joint
probability that it falls into both category l on attribute j and category m on
attribute k ($l = 1, \ldots, r_j$; $m = 1, \ldots, r_k$; $j = 1, \ldots, n$; $k = 1, \ldots, n$).

Assume that there exists a vector of random variables $x' = [x_1, \ldots, x_r]$, with
a joint distribution over the population of experimental units. The variables
$x_1, \ldots, x_r$ are to be referred to as latent traits. Let $p_{jl} \mid x$ denote the conditional
probability, given x, that an experimental unit falls into category l of attribute j,
and let $p_{jl.km} \mid x$ denote the joint conditional probability, given x, that an
experimental unit falls into both category l on attribute j, and category m on
attribute k ($l = 1, \ldots, r_j$; $m = 1, \ldots, r_k$; $j = 1, \ldots, n$; $k = 1, \ldots, n$).

Note that, necessarily,
\[
p_{jl.jm} = \begin{cases} p_{jl} & (l = m) \\ 0 & (l \neq m) \end{cases} \tag{1}
\]
and
\[
p_{jl.jm} \mid x = \begin{cases} p_{jl} \mid x & (l = m) \\ 0 & (l \neq m;\ j = 1, \ldots, n). \end{cases} \tag{2}
\]
Define a partitioned vector
\[
p' = [p_1' : \ldots : p_n'] = [p_{11}, \ldots, p_{1r_1} : \ldots : p_{n1}, \ldots, p_{nr_n}]
\]
and $n^2$ matrices $P_{jk}$, each of order $r_j \times r_k$, in which the (l, m)th element is equal
to $p_{jl.km}$ ($j = 1, \ldots, n$; $k = 1, \ldots, n$). Define also the supermatrix $P = [P_{jk}]$, of
order $t \times t$, and the matrix
\[
C = P - pp'. \tag{3}
\]
One may construct a partitioned random vector
\[
y' = [y_1' : \ldots : y_n'] = [y_{11}, \ldots, y_{1r_1} : \ldots : y_{n1}, \ldots, y_{nr_n}]
\]
such that if an experimental unit falls into category l on attribute j, then $y_{jl} = 1$,
and $y_{jm} = 0$ ($m \neq l$; $l = 1, \ldots, r_j$; $j = 1, \ldots, n$). It then follows that
\[
p = E\{y\} \tag{4}
\]
\[
P = E\{yy'\} \tag{5}
\]
\[
C = E\{yy'\} - E\{y\}E\{y'\}. \tag{6}
\]
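
Equations (3) to (6) can be illustrated with sample analogues computed from
raw categorical observations. The sketch below is only an illustration: the
function name `category_moments`, the zero-based category codes and the four
hypothetical observations are assumptions made here, not part of the paper.

```python
def category_moments(observations, n_categories):
    """Return sample analogues of p, P and C = P - pp' (eqns. (3)-(6)).

    observations : list of tuples, one zero-based category code per attribute
    n_categories : list with the number of categories r_j of each attribute
    """
    offsets = [sum(n_categories[:j]) for j in range(len(n_categories))]
    t = sum(n_categories)
    count = len(observations)
    p = [0.0] * t
    P = [[0.0] * t for _ in range(t)]
    for obs in observations:
        # Indicator vector y: one unit entry per attribute.
        y = [0] * t
        for j, category in enumerate(obs):
            y[offsets[j] + category] = 1
        for a in range(t):
            if y[a]:
                p[a] += 1 / count
                for b in range(t):
                    if y[b]:
                        P[a][b] += 1 / count
    C = [[P[a][b] - p[a] * p[b] for b in range(t)] for a in range(t)]
    return p, P, C


# Hypothetical data: two attributes with 3 and 2 categories.
observations = [(0, 1), (1, 0), (0, 0), (2, 1)]
p, P, C = category_moments(observations, [3, 2])
print(p)
```

Because the indicators of each attribute sum to one for every unit, each row of
C sums to zero within every block of columns belonging to one attribute.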


A consequence of this is that one may give the following development in
terms of either a generalized common factor analysis of the components of y, or
in terms of a generalized latent structure analysis. The latter course will be
followed.

As in the special case of dichotomous attributes, a variety of models of the
latent structure type may, in principle, be constructed. It will suffice to consider
the linear model,
\[
p_{jl} \mid x = \sum_{s=1}^{r} f_{jls} x_s + a_{jl} \quad (l = 1, \ldots, r_j;\ j = 1, \ldots, n). \tag{7}
\]




The origin and scale of each $x_s$ may be so chosen that
\[
E\{x_s\} = 0 \quad \text{and} \quad E\{x_s^2\} = 1 \quad (s = 1, \ldots, r). \tag{8}
\]
For simplicity, an orthogonal model will be assumed, such that
\[
E\{xx'\} = I. \tag{9}
\]
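Form the matrices $F_j$ ($j = 1, \ldots, n$), each of order $r_j \times r$, in which the (l, s)th
element is $f_{jls}$, and the supermatrix,
\[
F = \begin{bmatrix} F_1 \\ \vdots \\ F_n \end{bmatrix}
\]
of order $t \times r$. Assume that F is of rank r, and that
\[
p_{jl.km} \mid x = (p_{jl} \mid x)(p_{km} \mid x) \quad (j \neq k). \tag{10}
\]
From eqns. (7) and (8) it follows that
\[
p_{jl} = a_{jl} \quad (l = 1, \ldots, r_j;\ j = 1, \ldots, n). \tag{11}
\]
From eqns. (7), (8), (9), (10) and (11) it follows that
\[
p_{jl.km} = \sum_{s=1}^{r} f_{jls} f_{kms} + p_{jl} p_{km} \quad (j \neq k), \tag{12}
\]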

which may alternately be expressed in the form
\[
P_{jk} - p_j p_k' = F_j F_k' \quad (j \neq k), \tag{13}
\]
whence it follows that there exists a $t \times t$ matrix D of the form
\[
D = \begin{bmatrix} D_{11} & & \\ & \ddots & \\ & & D_{nn} \end{bmatrix},
\]
where $D_{jj}$ is of order $r_j \times r_j$ ($j = 1, \ldots, n$), such that
\[
C - D = FF' \tag{14}
\]
is of rank r. Hence it might appear that the parameters of the model (7), i.e.
F and p, can be determined by an application of the generalized factor analysis
procedure described in the previous paper, as for the case of groups of variables.
(In practice, the value of r could be guessed, and the algorithm applied to
analogues of P and p obtained from a large sample.) However, it follows from
eqn. (7) that
\[
1 = \sum_{l=1}^{r_j} p_{jl} \mid x = \sum_{s=1}^{r} \sum_{l=1}^{r_j} f_{jls} x_s + \sum_{l=1}^{r_j} a_{jl} \quad (j = 1, \ldots, n) \tag{15}
\]


for every value of $x_1, \ldots, x_r$. Since it is also the case that
\[
\sum_{l=1}^{r_j} a_{jl} = \sum_{l=1}^{r_j} p_{jl} = 1 \quad (j = 1, \ldots, n) \tag{16}
\]
it must be, then, that
\[
\sum_{l=1}^{r_j} f_{jls} = 0 \quad (j = 1, \ldots, n;\ s = 1, \ldots, r) \tag{17}
\]
or, what is the same thing, that
\[
F_j' 1 = 0 \quad (j = 1, \ldots, n), \tag{18}
\]
where 1 is the ($r_j \times 1$) vector whose elements are all unity.

Thus, eqns. (13) and (18) show that the problem here considered is formally
identical with that of factor analysing groups of variables in the sense of the
previous paper (McDonald, 1969), except that the required matrix F is subject
to the n constraints (18). It would seem reasonable, given C, to modify the
algorithm previously described, in order to yield F and D such that these
constraints shall be satisfied at each stage of the iterative procedure. (A parallel
device has been used by Horst (1961), on a different problem.) The proposed
algorithm, then, is essentially that for the case in Section 3.3 of McDonald
(1969), with the following modification. Let $F_j$ be the jth submatrix of the
first approximation to F given by eqn. (19) in McDonald (1969). Then replace
this by the submatrix
\[
\tilde{F}_j = (I - r_j^{-1} 1 1') F_j, \tag{19}
\]
where 1 is the ($r_j \times 1$) vector whose elements are all unity ($j = 1, \ldots, n$) and
similarly proceed at each iteration. That is, at each stage the approximation to
F is adjusted so that the constraints (18) are satisfied, by subtracting from the
elements of each column of a submatrix, the mean of those elements.
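
The adjustment in eqn. (19) amounts to centring each column of each
submatrix. A minimal Python sketch follows; the function names and the
numerical matrix `F` are hypothetical and serve only to show the operation.

```python
def centre_columns(F_j):
    """Apply (I - 11'/r_j) to an r_j x r block: subtract each column's mean."""
    r_j = len(F_j)
    n_cols = len(F_j[0])
    means = [sum(row[s] for row in F_j) / r_j for s in range(n_cols)]
    return [[row[s] - means[s] for s in range(n_cols)] for row in F_j]


def centre_supermatrix(F, block_sizes):
    """Centre every submatrix F_j of the supermatrix F in turn."""
    adjusted, start = [], 0
    for r_j in block_sizes:
        adjusted.extend(centre_columns(F[start:start + r_j]))
        start += r_j
    return adjusted


# Hypothetical first approximation: two attributes with 3 and 2 categories, r = 2.
F = [[0.30, 0.10],
     [0.10, -0.20],
     [-0.10, 0.40],
     [0.25, 0.05],
     [-0.15, 0.15]]
F_adjusted = centre_supermatrix(F, [3, 2])
for row in F_adjusted:
    print(["%.4f" % v for v in row])
```

After the adjustment the columns of every submatrix sum to zero, so the
constraints (18) hold, and applying the adjustment a second time changes
nothing.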


At first sight, such modification might not seem necessary. If F, D are
given such that eqn. (14) is satisfied, it follows from eqn. (13) that
\[
F_k F_j' 1 = [P_{kj} - p_k p_j'] 1 = 0 \quad (j \neq k). \tag{20}
\]
For r sufficiently small, eqn. (20) can be satisfied if and only if eqn. (18) is satisfied.
However, since in practice the matrix D obtained will only approximately
satisfy eqn. (14), it seems appropriate to employ an algorithm in which eqn. (18)
is satisfied at every iteration.
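
Eqns. (13) and (20) can be checked exactly on a toy model. In the sketch
below everything numerical is hypothetical: a single latent trait taking the
values -1 and +1 with equal probability (so that its mean is 0 and its
variance 1), two attributes with three and two categories, and loadings that
sum to zero within each attribute as eqn. (17) requires.

```python
# Hypothetical one-factor model with two attributes (3 and 2 categories).
a1, f1 = [0.2, 0.3, 0.5], [0.1, 0.1, -0.2]
a2, f2 = [0.6, 0.4], [0.3, -0.3]
support = [(-1.0, 0.5), (1.0, 0.5)]   # (value of x, probability)


def conditional(a, f, x):
    """Conditional category probabilities under the linear model (7)."""
    return [a_l + f_l * x for a_l, f_l in zip(a, f)]


# Joint probabilities for the two attributes under local independence (10).
P12 = [[sum(w * conditional(a1, f1, x)[l] * conditional(a2, f2, x)[m]
            for x, w in support)
        for m in range(len(a2))]
       for l in range(len(a1))]

# Left and right sides of eqn. (13): P_12 - p_1 p_2' versus F_1 F_2'.
lhs = [[P12[l][m] - a1[l] * a2[m] for m in range(len(a2))] for l in range(len(a1))]
rhs = [[f1[l] * f2[m] for m in range(len(a2))] for l in range(len(a1))]
max_gap = max(abs(lhs[l][m] - rhs[l][m]) for l in range(3) for m in range(2))

# Eqn. (20): each row of P_12 - p_1 p_2' sums to zero.
row_sums = [sum(row) for row in lhs]
print(max_gap, row_sums)
```

The two sides of eqn. (13) agree to rounding error, and the row sums vanish
because the loadings of the second attribute sum to zero.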
As in the classical common factor model, any orthonormal r x r transform of
F satisfies eqn. (14) equally, and may replace F in the model. More generally,
F may be replaced, if desired, by any non-singular r x r transform, with a
correlative relaxation of the assumption (9), by the usual procedures for obtaining an
[...].
[...] the model (7) serves to account for the joint [...]
occurrence of categories of distinct items, with a minimum number of latent
traits (one or more), which may also be regarded as common factors. This is in
contrast to principal component analysis, which in the case of multicategory data
requires in general a number of components equal to the total number of
categories, less the number of attributes.


3. AN EMPIRICAL EXAMPLE 

Burt (1950) illustrates his discussion of the principal components of multi- 
category data, with an example drawn from anthropometric research. The 
basic observations, on a sample of 100 males, were of (1) hair-colour, classified as 
fair, red or dark, (2) eye-colour, classified as light, mixed or brown, (3) head- 
shape, narrow or wide, and (4) stature, tall or short. Thus there is a total of ten 
categories, for four attributes. The observed joint frequencies are reproduced in
Table 1. This table yields the matrix C (eqn. (13)), given in Table 2. The
constraints on row and column sums of submatrices are readily noted. Table 3 
indicates the behaviour of the eigenvalues of C-D through 20 successive. 


TABLE 1. OBSERVED FREQUENCIES

                  Hair               Eyes            Head       Stature
            Fair  Red  Dark   Light Mixed Brown  Narrow Wide   Tall Short
Hair
  Fair       22    0     0     14     6     2      14     8     13     9
  Red         0   15     0      8     5     2      11     4     10     5
  Dark        0    0    63     11    25    27      44    19     20    43
Eyes
  Light      14    8    11     33     0     0      27     6     29     4
  Mixed       6    5    25      0    36     0      20    16     10    26
  Brown       2    2    27      0     0    31      22     9      4    27
Head
  Narrow     14   11    44     27    20    22      69     0     30    39
  Wide        8    4    19      6    16     9       0    31     13    18
Stature
  Tall       13   10    20     29    10     4      30    13     43     0
  Short       9    5    43      4    26    27      39    18      0    57
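
As a rough check on the frequencies above, the sample analogue of
C = P - pp' can be formed directly from Table 1. The sketch below is an
illustration only; the variable names and the elimination-based `matrix_rank`
helper are assumptions introduced here, not the paper's program.

```python
# Joint frequencies of Table 1 (100 males; categories in the printed order).
FREQ = [
    [22, 0, 0, 14, 6, 2, 14, 8, 13, 9],
    [0, 15, 0, 8, 5, 2, 11, 4, 10, 5],
    [0, 0, 63, 11, 25, 27, 44, 19, 20, 43],
    [14, 8, 11, 33, 0, 0, 27, 6, 29, 4],
    [6, 5, 25, 0, 36, 0, 20, 16, 10, 26],
    [2, 2, 27, 0, 0, 31, 22, 9, 4, 27],
    [14, 11, 44, 27, 20, 22, 69, 0, 30, 39],
    [8, 4, 19, 6, 16, 9, 0, 31, 13, 18],
    [13, 10, 20, 29, 10, 4, 30, 13, 43, 0],
    [9, 5, 43, 4, 26, 27, 39, 18, 0, 57],
]
BLOCKS = [3, 3, 2, 2]   # categories per attribute: hair, eyes, head, stature
N_OBS = 100

P = [[v / N_OBS for v in row] for row in FREQ]
p = [P[i][i] for i in range(len(P))]
C = [[P[i][k] - p[i] * p[k] for k in range(len(P))] for i in range(len(P))]


def matrix_rank(A, tol=1e-9):
    """Numerical rank by Gaussian elimination with partial pivoting."""
    M = [row[:] for row in A]
    n_rows, n_cols = len(M), len(M[0])
    rank = 0
    for col in range(n_cols):
        if rank == n_rows:
            break
        pivot = max(range(rank, n_rows), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < tol:
            continue
        M[rank], M[pivot] = M[pivot], M[rank]
        for r in range(rank + 1, n_rows):
            factor = M[r][col] / M[rank][col]
            for c in range(col, n_cols):
                M[r][c] -= factor * M[rank][c]
        rank += 1
    return rank


trace_C = sum(C[i][i] for i in range(len(C)))
print(round(trace_C, 4), matrix_rank(C))
```

The rank found this way equals the number of categories less the number of
attributes, which is the number of principal components the text says such
data require in general.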


[...], i.e. latent traits prescribed,
[...] at each iteration. As in the
[...] coefficients $a_{jl}$ ($l = 1, \ldots, r_j$; $j = 1, \ldots, n$)
[...]; for comparison, are the corresponding
[...] (starting point of the iterations), adjusted to
[...]. Table 5 gives the residual
matrix, C - FF'. Elements in the off-diagonal submatrices are expected to be
‘small’ in comparison with the elements of C.
The same data were submitted to GENPAX, without adjustment of the
submatrices of F to satisfy the constraints indicated by eqn. (18). After four
iterations, the results so obtained agreed with the results from the constrained
version, to the four decimal places printed out, with discrepancies at the first
iteration of less than 0.0007 in the elements of F.


TABLE 3. EIGENVALUES

Iteration                          Eigenvalues                                 Criterion
    0      0.77  0.47  0.31  0.28  0.17  0.11  0.00  0.00   0.00   0.00      0.02723
    2      0.63  0.37  0.07  0.02  0.00  0.00  0.00  0.00  -0.01  -0.03      0.00078
    4      0.61  0.37  0.05  0.01  0.00  0.00  0.00  0.00  -0.01  -0.04      0.00051
    6      0.61  0.37  0.04  0.01  0.00  0.00  0.00  0.00  -0.01  -0.03      0.00038
    8      0.61  0.37  0.03  0.01  0.00  0.00  0.00  0.00  -0.02  -0.03      0.00030
   10      0.61  0.37  0.03  0.01  0.00  0.00  0.00  0.00  -0.02  -0.03      0.00026
   12      0.61  0.37  0.03  0.02  0.00  0.00  0.00  0.00  -0.02  -0.03      0.00023
   14      0.61  0.37  0.02  0.02  0.00  0.00  0.00  0.00  -0.02  -0.03      0.00022
   16      0.61  0.37  0.02  0.02  0.00  0.00  0.00  0.00  -0.01  -0.03      0.00021
   18      0.61  0.37  0.02  0.02  0.00  0.00  0.00  0.00  -0.01  -0.03      0.00020
   20      0.62  0.37  0.02  0.02  0.00  0.00  0.00  0.00  -0.01  -0.03      0.00020


TABLE 4

                        Factor pattern        Principal components
             p_jl        I         II             I         II
Hair
  Fair       0.22      0.1273   -0.0587        0.1819   -0.0694
  Red        0.15      0.0848   -0.0107        0.1143   -0.0106
  Dark       0.63     -0.2121    0.0694       -0.2963    0.0800
Eyes
  Light      0.33      0.4630   -0.0137        0.4044    0.0906
  Mixed      0.36     -0.1753   -0.0742       -0.1551   -0.2554
  Brown      0.31     -0.2877    0.0878       -0.2493    0.1648
Head
  Narrow     0.69      0.1025    0.4111        0.0504    0.4218
  Wide       0.31     -0.1025   -0.4111       -0.0504   -0.4218
Stature
  Tall       0.43      0.3151   -0.0676        0.4367   -0.0460
  Short      0.57     -0.3151    0.0676       -0.4367    0.0460
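
Under eqn. (12) a joint probability for categories of two different attributes
is reproduced as the product of the marginals plus the sum of products of
loadings. The sketch below applies this to the factor pattern as read from
Table 4; the dictionary layout and the function name `fitted_joint` are
assumptions made for illustration.

```python
# Marginal proportions and two-factor pattern as read from Table 4.
P_MARGINAL = {
    "Fair": 0.22, "Red": 0.15, "Dark": 0.63,
    "Light": 0.33, "Mixed": 0.36, "Brown": 0.31,
    "Narrow": 0.69, "Wide": 0.31,
    "Tall": 0.43, "Short": 0.57,
}
PATTERN = {
    "Fair": (0.1273, -0.0587), "Red": (0.0848, -0.0107), "Dark": (-0.2121, 0.0694),
    "Light": (0.4630, -0.0137), "Mixed": (-0.1753, -0.0742), "Brown": (-0.2877, 0.0878),
    "Narrow": (0.1025, 0.4111), "Wide": (-0.1025, -0.4111),
    "Tall": (0.3151, -0.0676), "Short": (-0.3151, 0.0676),
}


def fitted_joint(cat_a, cat_b):
    """Fitted joint probability for categories of two distinct attributes, eqn. (12)."""
    cross = sum(fa * fb for fa, fb in zip(PATTERN[cat_a], PATTERN[cat_b]))
    return P_MARGINAL[cat_a] * P_MARGINAL[cat_b] + cross


# Compare with the observed proportions 14/100 and 27/100 from Table 1.
print(round(fitted_joint("Fair", "Light"), 4), 14 / 100)
print(round(fitted_joint("Dark", "Brown"), 4), 27 / 100)
```

The fitted values fall close to the observed proportions for these two cells,
which is what a two-factor solution with small residuals would be expected to
give.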


4. DISCUSSION 


It is evident that the algorithm described can yield a plausible solution to the 
problem of fitting a linear latent structure model, or quasi-factor model, to 
multicategory data. Problems of statistical estimation and hypothesis-testing 
remain to be explored. 

An obvious limitation of the model lies in the fact that there are bounds on 
the values of the latent traits or factors beyond which the model implies 
probabilities for the categories of each attribute that lie outside the permissible 
limits. There is no difficulty, in principle, in combining the above treatment 
with the non-linear (polynomial) factor analytic techniques described by
McDonald (1967a, b, c).
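
The bounds on the latent traits mentioned above can be made concrete for a
single trait. The sketch below is an illustration under stated assumptions: it
uses the marginal proportions and the first column of the factor pattern as
read from Table 4, holds the second trait at zero, and the function name
`admissible_interval` is hypothetical.

```python
def admissible_interval(intercepts, loadings):
    """Range of x for which every a_jl + f_jl * x lies in [0, 1]."""
    lower, upper = float("-inf"), float("inf")
    for a, f in zip(intercepts, loadings):
        if f > 0:
            lo, hi = -a / f, (1 - a) / f
        elif f < 0:
            lo, hi = (1 - a) / f, -a / f
        else:
            continue            # a zero loading imposes no bound
        lower, upper = max(lower, lo), min(upper, hi)
    return lower, upper


# Values as read from Table 4 (p_jl and factor I), in the printed category order.
a = [0.22, 0.15, 0.63, 0.33, 0.36, 0.31, 0.69, 0.31, 0.43, 0.57]
f = [0.1273, 0.0848, -0.2121, 0.4630, -0.1753, -0.2877, 0.1025, -0.1025, 0.3151, -0.3151]
lower, upper = admissible_interval(a, f)
print(round(lower, 4), round(upper, 4))
```

On these figures the lower bound is set by the Light category and the upper
bound by the Brown category; outside that interval at least one implied
probability leaves the permissible range.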


In spite of its limitations, the model yields a first approximation [...]
of multicategory data that involves no loss of information about individual
categories. It thus becomes possible, for example, in work on test [...],
to judge the merits of answer categories in multiple choice questions, in terms of
their discriminating power on one or more dimensions of the trait complex
under study.


Finally, it should be mentioned that a treatment of the present problem was
promised by McDonald (1968), conceived as a further consequence of the
general theory of weighted combinations of variables, as then developed. The
hope was that a generalized form of canonical factor analysis would prove possible.
At the same time, it was recognized that difficulties could be expected, due
essentially to the fact that canonical factor analysis is not applicable to singular
‘covariance’ matrices. The use of a principal factor analysis algorithm seems
to avoid these difficulties.
THE NORMAL SCORES TEST FOR THE
c-SAMPLE PROBLEM


By MARYELLEN MCSWEENEY
Michigan State University


and DOUGLAS PENFIELD
Rutgers University


Two forms of a c-sample normal scores test are presented as alternatives to
the Kruskal-Wallis test (KW). The normal scores test statistics and multiple
comparison procedures are illustrated for large samples requiring a correction for
tied observations. Theoretical and empirical evidence is cited in support of
using the normal scores test in preference to a rank test such as the KW. The
normal scores test is asymptotically more efficient than the KW for samples from
common non-normal populations (e.g. uniform and exponential) and is less
sensitive to non-normality and unequal variances than is either the F-test or the
KW test.
1. INTRODUCTION 


The behavioural scientist frequently uses analysis of variance to decide
whether sample differences in central tendency reflect true differences in the
parent populations or are simply chance variations among random samples from
the same population. Analysis of variance is appropriate if the assumptions
of normality, homogeneity of variance, and independence of the errors can be
satisfied. When the normality and/or homogeneity of variance assumptions
for the F-test are suspect, or when only the ranks of the observations are known,
non-parametric tests can often be substituted for the parametric procedures.

The Kruskal-Wallis test based on ranks is a frequent non-parametric
substitute for the F-test in a one-way analysis of variance. When they proposed
this rank test, Kruskal & Wallis (1952) hypothesized that a test statistic
using normalized observations in place of ranks would approach the $\chi^2$
distribution more rapidly for small to moderate sample sizes than does the
Kruskal-Wallis test. For such normalized observations the ‘normal scores’ test has
been developed for the two-sample location problem (Hoeffding, 1951; Terry,
1952; Van der Waerden, 1953); however, much less attention has been given
to the c-sample problem. Kendall & Stuart (1961) alluded to a c-sample normal
scores test but did not indicate the test statistic. Hájek & Šidák (1967) stated
the test statistic for a c-sample normal scores test which employs expected normal
order statistics (Terry-Hoeffding form), and an asymptotically equivalent test,
which uses inverse normal statistics in place of ranks (Van der Waerden form).

Although Hájek & Šidák are apparently alone in their presentation of the
c-sample normal scores test, Puri (1964) has developed a generalized c-sample
test of which both the normal scores and Kruskal-Wallis tests are special cases.
Puri has shown that the observations in a c-sample test may be replaced by any
convenient numbers, and he has derived the conditions for joint asymptotic
normality of the mean vectors of the transformed observations. Puri's derivation
of the Pitman efficiency of the Kruskal-Wallis test relative to the [...]
c-sample test makes possible the comparison of competing c-sample tests. The
asymptotic results of Hodges & Lehmann (1961) and of Puri suggest that a
normal scores test will be more efficient than the corresponding rank test when
the samples are drawn from distributions having abrupt tails (e.g. uniform), but
that the tests will be almost equally efficient when the samples are drawn from
bell-shaped distributions (e.g. normal, logistic).
This paper presents the normal scores test statistics and the associated
multiple comparison procedures using both expected normal order statistics
and inverse normal statistics. An example dealing with the effect of environmental
noise on the performance of a complex learning task (Hays, 1963) is used
to illustrate the normal scores test. Theoretical and empirical results comparing
the normal scores test and rank tests with respect to validity, Monte Carlo
power, and robustness to scale alternatives are cited.


2. RATIONALE FOR AND DERIVATION OF THE c-SAMPLE NORMAL
SCORES TEST


Assume that c independent random samples
\[
x_{11}, \ldots, x_{1n_1};\ x_{21}, \ldots, x_{2n_2};\ \ldots;\ x_{c1}, \ldots, x_{cn_c}
\]
have been drawn from populations with continuous cumulative distribution
functions, $F_i(x)$ ($i = 1, 2, \ldots, c$) respectively. Associate with each observation,
$x_{ij}$, its rank order in the entire sample of N observations. The ranks may be
associated with the corresponding observations by assigning the subscripts of
the observations to the ranks. Thus $r_{ij}$ would be the rank of the jth score in
the ith sample and $x^{(r_{ij})}$ would denote the $r_{ij}$th smallest observation in the
sample of N observations.

The hypothesis of stochastic equality of the c populations
\[
H_0: F_1(x) = F_2(x) = \ldots = F_c(x)
\]
is to be tested against a shift alternative
\[
H_1: F_i(x) = F(x - \theta_i)
\]

where not all $\theta_i$ are equal. If the test statistic is to be applicable to ordinal data
then the monotonicity of the observations must be preserved, but the original
observations need not be retained in the test statistic. If the ranks, rather than
the original observations, are taken as the data, then one may reason from the
observable ranks to some hypothetical distribution from which the ranks could
have been drawn. For convenience let us assume that the observable ranks
correspond to a single hypothetical random sample of size N from a standard
normal distribution, $\Phi(x)$. If $r_{ij}$ is known, then the best guess as to the value
of the corresponding observation from the hypothetical standard normal
distribution would be the expected normal order statistic, $E[x^{(r_{ij})}]$. $E[x^{(r_{ij})}]$ is
the expected value of the $r_{ij}$th smallest observation in a sample of size N from a
standard normal population. It is the value of x to be expected, on the average,
if the $r_{ij}$th smallest observation were selected repeatedly from samples of N
observations from a standard normal distribution. Thus the ranks of the
observations, $r_{ij}$, are replaced by the corresponding expected normal order
statistics, $E[x^{(r_{ij})}]$, under the assumption that a set of N observations from a
hypothetical standard normal distribution, $\Phi(x)$, would have the configuration
of ranks actually obtained in the sample from unspecified distribution, F(x).

Exactly the same type of reasoning that applied to the expected normal order
statistic applies to the statistic using the inverse normal distribution. As before,
$r_{ij}$ denotes the rank of the jth observation in the ith sample and $\Phi(x)$ is the
standard normal distribution function. Since the values of $\Phi(x)$ necessarily
lie in the unit interval $0 < \Phi(x) < 1$, this interval can be subdivided into N + 1
equally spaced subintervals marked by the endpoints:
\[
0,\ 1/(N+1),\ 2/(N+1),\ \ldots,\ (N-1)/(N+1),\ N/(N+1),\ 1.
\]
The ranks, $r_{ij}$, assume the values 1, 2, ..., (N - 1), N, and therefore this
subdivision can be accomplished by finding $\Phi(x^{(r_{ij})}) = r_{ij}/(N+1)$ for all $r_{ij}$. If
$\Phi(x^{(r_{ij})}) = r_{ij}/(N+1)$, then $x^{(r_{ij})} = \Phi^{-1}[r_{ij}/(N+1)]$.

Thus N normalized scores $x^{(r_{ij})} = \Phi^{-1}[r_{ij}/(N+1)]$ are obtained from the inverse
normal distribution. Their rank order is the same as that of the actual observations.
Once again the observations from unspecified distributions, or their
ranks, have been replaced by normalized observations having the same
configuration of ranks.
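
The inverse normal (Van der Waerden) scores can be computed directly from the
ranks. The sketch below uses `statistics.NormalDist` from the Python standard
library; the function name `van_der_waerden_scores` is an assumption made here,
and the handling of tied ranks is deliberately left out.

```python
from statistics import NormalDist


def van_der_waerden_scores(ranks, total_n):
    """Replace each rank r by the standard normal quantile at r / (N + 1)."""
    std_normal = NormalDist()          # mean 0, standard deviation 1
    return [std_normal.inv_cdf(r / (total_n + 1)) for r in ranks]


# Ranks 1..5 in a combined sample of N = 5 observations.
scores = van_der_waerden_scores(range(1, 6), 5)
print([round(s, 4) for s in scores])
```

The scores keep the order of the ranks, are symmetric about zero, and
therefore sum to zero over the combined sample.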

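The test statistics based on $E[x^{(r_{ij})}]$ and $\Phi^{-1}[r_{ij}/(N+1)]$ are asymptotically
equivalent and structurally identical; thus a single derivation will suffice for
both. Let $w_{ij} = E[x^{(r_{ij})}]$ for the expected normal order statistic and
$w_{ij} = \Phi^{-1}[r_{ij}/(N+1)]$ for the inverse normal statistic. Let $\bar{w}_{i.} = \sum_j w_{ij}/n_i$
and note that $\bar{w}_{..} = \sum_i \sum_j w_{ij}/N = 0$, since $\sum_i \sum_j E[x^{(r_{ij})}] = 0$, while
$\sum_i \sum_j \Phi^{-1}[r_{ij}/(N+1)] = 0$. Consequently $\sum_i \sum_j (w_{ij} - \bar{w}_{..})^2 = \sum_i \sum_j w_{ij}^2$ is constant
for fixed N, and $\sum_i n_i(\bar{w}_{i.} - \bar{w}_{..})^2 = \sum_i n_i \bar{w}_{i.}^2$. For mean normal scores based
on equal sample sizes n it can be shown that
\[
E(\bar{w}_{i.}) = \frac{\sum_j E(w_{ij})}{n} = 0
\]
and that
\[
\operatorname{var}(\bar{w}_{i.}) = \frac{1}{n^2}\Big[\sum_j \operatorname{var}(w_{ij}) + \sum_{j \neq m} \operatorname{cov}(w_{ij}, w_{im})\Big]
= \frac{N - n}{nN(N - 1)} \sum_i \sum_j w_{ij}^2 .
\]
If the finite population correction is omitted, and the equal sample size
restriction is dropped,
\[
\operatorname{var}(\bar{w}_{i.}) = \frac{\sum_i \sum_j w_{ij}^2}{n_i N}.
\]
The usual partitioning of $\sum_i \sum_j (w_{ij} - \bar{w}_{..})^2$ yields
\[
\sum_i \sum_j (w_{ij} - \bar{w}_{..})^2 = \sum_i \sum_j (w_{ij} - \bar{w}_{i.})^2 + \sum_i n_i(\bar{w}_{i.} - \bar{w}_{..})^2 .
\]
Thus the ordinary F-ratio
\[
F = \frac{\sum_i n_i(\bar{w}_{i.} - \bar{w}_{..})^2 / (c - 1)}{\sum_i \sum_j (w_{ij} - \bar{w}_{i.})^2 / (N - c)}
\]
is a monotone-increasing function of
\[
\sum_i n_i(\bar{w}_{i.} - \bar{w}_{..})^2
\]
and, in turn, a monotone-increasing function of
\[
\frac{\sum_i n_i(\bar{w}_{i.} - \bar{w}_{..})^2}{\sum_i \sum_j w_{ij}^2} .
\]
Multiplication of the last ratio by N gives
\[
\frac{N \sum_i n_i(\bar{w}_{i.} - \bar{w}_{..})^2}{\sum_i \sum_j w_{ij}^2}
= \sum_i \frac{(\bar{w}_{i.} - E(\bar{w}_{i.}))^2}{\sum_i \sum_j w_{ij}^2 / (n_i N)}
= \sum_i \frac{(\bar{w}_{i.} - E(\bar{w}_{i.}))^2}{\operatorname{var}(\bar{w}_{i.})},
\]
a statistic which is asymptotically distributed as $\chi^2$ with c - 1 degrees of
freedom. Multiplication of this statistic by (N - 1)/N will yield the statistic
\[
W = \frac{(N - 1) \sum_i n_i \bar{w}_{i.}^2}{\sum_i \sum_j w_{ij}^2}
= \frac{(N - 1) \sum_i \big[\big(\sum_j w_{ij}\big)^2 / n_i\big]}{\sum_i \sum_j w_{ij}^2},
\]
which is also asymptotically distributed as $\chi^2$ with c - 1 degrees of freedom.
The advantage of W over other asymptotically equivalent random variables on
which the test of stochastic equality might be based is that under the null
hypothesis, E(W) = E($\chi^2$ with c - 1 degrees of freedom) = c - 1.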
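
The statement that E(W) = c - 1 under the null hypothesis can be checked by
simulation. The sketch below is only an illustration under stated assumptions:
it uses inverse normal scores for N = 12, three samples of four observations,
random reassignment of the scores to samples as a stand-in for the null
hypothesis, and a fixed seed; none of these choices comes from the paper.

```python
import random
from statistics import NormalDist, fmean


def normal_scores_w(samples):
    """W = (N - 1) * sum_i (sum_j w_ij)^2 / n_i  divided by  sum_ij w_ij^2.

    Assumes the scores sum to zero over the combined sample.
    """
    total_n = sum(len(s) for s in samples)
    between = sum(sum(s) ** 2 / len(s) for s in samples)
    total_ss = sum(w * w for s in samples for w in s)
    return (total_n - 1) * between / total_ss


rng = random.Random(0)                      # fixed seed for repeatability
sizes = [4, 4, 4]
N = sum(sizes)
scores = [NormalDist().inv_cdf(r / (N + 1)) for r in range(1, N + 1)]

values = []
for _ in range(20000):
    rng.shuffle(scores)                     # random assignment of scores to samples
    groups, start = [], 0
    for n_i in sizes:
        groups.append(scores[start:start + n_i])
        start += n_i
    values.append(normal_scores_w(groups))

mean_w = fmean(values)
print(round(mean_w, 3), len(sizes) - 1)
```

The simulated mean of W should lie close to c - 1 = 2, up to Monte Carlo
error.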
The corresponding expressions for the test statistic in terms of expected
normal order statistics and inverse normal statistics are, respectively:
\[
W = \frac{(N - 1) \sum_i \big[\big(\sum_j E[x^{(r_{ij})}]\big)^2 / n_i\big]}{\sum_i \sum_j \big(E[x^{(r_{ij})}]\big)^2}
\]
and
\[
W = \frac{(N - 1) \sum_i \big[\big(\sum_j \Phi^{-1}[r_{ij}/(N+1)]\big)^2 / n_i\big]}{\sum_i \sum_j \big(\Phi^{-1}[r_{ij}/(N+1)]\big)^2} .
\]
This derivation parallels Kruskal's (1952) derivation of the Kruskal-Wallis
statistic. Marascuilo's (1966) multiple comparison procedures based on
chi-square can be used to derive post hoc comparisons appropriate for the normal
scores test.
In the c-sample problem, the parameters of interest are the population mean
normal scores, $E(\bar{w}_{i.})$, and their corresponding sample estimators are the mean
normal scores, $\bar{w}_{i.}$. When mean normal scores are based on equal sample
sizes, n,
\[
E(\bar{w}_{i.}) = \frac{\sum_j E(w_{ij})}{n} = 0,
\]
\[
\operatorname{var}(\bar{w}_{i.}) = \frac{N - n}{nN(N - 1)} \sum_i \sum_j w_{ij}^2 \quad (i = 1, \ldots, c),
\]
and it can be shown that
\[
\operatorname{cov}(\bar{w}_{i.}, \bar{w}_{k.}) = \operatorname{cov}\Big(\frac{\sum_j w_{ij}}{n}, \frac{\sum_j w_{kj}}{n}\Big)
= -\frac{\sum_i \sum_j w_{ij}^2}{N(N - 1)} .
\]
Contrasts between the mean normal scores, $\bar{w}_{i.}$, are of interest. Let $\hat{\psi} = \sum_i a_i \bar{w}_{i.}$
be any arbitrary contrast in the $\bar{w}_{i.}$ such that $\sum_i a_i = 0$, and let $\psi$ denote the
corresponding contrast in the population parameters. Marascuilo (1966) has
used large sample theory to define a set of $(1 - \alpha)$% simultaneous confidence
intervals for the $\psi$, showing that in the limit, the probability is $(1 - \alpha)$ that
simultaneously for all linear contrasts of the form
\[
\hat{\psi} - \sqrt{\chi^2_{c-1}} \sqrt{\operatorname{var}(\hat{\psi})} < \psi < \hat{\psi} + \sqrt{\chi^2_{c-1}} \sqrt{\operatorname{var}(\hat{\psi})},
\]
where $\chi^2_{c-1}$ is taken at the $(1 - \alpha)$% level.
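For mean normal scores, $\hat{\psi} = \sum_i a_i \bar{w}_{i.}$,
\[
E(\hat{\psi}) = \sum_i a_i E(\bar{w}_{i.}) = 0,
\]
while
\[
\operatorname{var}(\hat{\psi}) = \sum_i a_i^2 \operatorname{var}(\bar{w}_{i.}) + \sum_{i \neq k} a_i a_k \operatorname{cov}(\bar{w}_{i.}, \bar{w}_{k.})
\]
\[
= \sum_i a_i^2 \frac{N - n}{nN(N - 1)} \sum_i \sum_j w_{ij}^2 - \sum_{i \neq k} a_i a_k \frac{\sum_i \sum_j w_{ij}^2}{N(N - 1)}
= \frac{\sum_i a_i^2 \sum_i \sum_j w_{ij}^2}{n(N - 1)},
\]
since
\[
0 = \Big(\sum_i a_i\Big)^2 = \sum_i a_i^2 + \sum_{i \neq k} a_i a_k, \quad \text{and} \quad \sum_{i \neq k} a_i a_k = -\sum_i a_i^2 .
\]
Thus the corresponding inequality in terms of normal scores defines the
set of $(1 - \alpha)$% simultaneous confidence intervals which could be
written for contrasts in the $\bar{w}_{i.}$:
\[
\sum_i a_i \bar{w}_{i.} - \sqrt{\chi^2_{c-1}} \sqrt{\frac{\sum_i a_i^2 \sum_i \sum_j w_{ij}^2}{n(N - 1)}}
< \psi <
\sum_i a_i \bar{w}_{i.} + \sqrt{\chi^2_{c-1}} \sqrt{\frac{\sum_i a_i^2 \sum_i \sum_j w_{ij}^2}{n(N - 1)}},
\]
where $\chi^2_{c-1}$ is taken at the $(1 - \alpha)$% level.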
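
The interval for a contrast can be sketched in code. The function name
`contrast_interval` and its arguments are assumptions introduced here. The
numbers in the usage example are column means and the total sum of squares
computed from the expected normal order statistics of the paper's Table 2
(six samples of ten, N = 60), the contrast compares noise levels III and VI,
and 11.0705 is the tabled 95th percentile of chi-square with 5 degrees of
freedom, supplied by hand rather than computed.

```python
import math


def contrast_interval(means, coefficients, total_ss, n, total_n, chi2_critical):
    """Simultaneous interval for sum_i a_i * mean_i with equal sample sizes n.

    var(psi_hat) = sum_i a_i^2 * total_ss / (n * (N - 1)), as derived above.
    """
    if abs(sum(coefficients)) > 1e-12:
        raise ValueError("contrast coefficients must sum to zero")
    estimate = sum(a * m for a, m in zip(coefficients, means))
    variance = sum(a * a for a in coefficients) * total_ss / (n * (total_n - 1))
    half_width = math.sqrt(chi2_critical) * math.sqrt(variance)
    return estimate - half_width, estimate + half_width


# Mean expected normal order statistics for noise levels I-VI (from Table 2).
means = [-0.19077, 0.69569, 1.16819, 0.23837, -0.60110, -1.31038]
lower, upper = contrast_interval(
    means, [0, 0, 1, 0, 0, -1],
    total_ss=57.3048, n=10, total_n=60, chi2_critical=11.0705,
)
print(round(lower, 3), round(upper, 3))
```

An interval that excludes zero, as this one does, would indicate that the two
levels differ in location as measured by the mean normal scores.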
As is true of the F-test, the c-sample normal s 


rom zero. Thus inspection of the 
is significantly different from 

or as a means of identifying those ) 
“ПОП as measured by the mean normal scores. 2 


) ; Ен » that Hy has been rejected without 
cause, since the entire set of permissible co. 
3. EXAMPLE OF THE USE OF THE c-SAMPLE NORMAL SCORES TEST


The following example illustrates the large sample use of the normal scores
test and the problems associated with the occurrence of tied observations. In
this hypothetical example (Hays, 1963, p. 548), performance on a complex
task is reported for six different levels of noise intensity, ten subjects being
allotted to each level. Table 1 gives the ranks of the scores and shows the
frequent occurrence of ties. The hypothesis tested is that the populations of 
scores for the six different noise intensity levels are stochastically equal. 
Rejection of the hypothesis of stochastic equality and the use of post hoc multiple 
comparisons merely identify the differences in location among the distributions; 
they do not interpret these differences in terms of common measures of location. 


TABLE 1. RANKS OF SCORES ON A COMPLEX PERFORMANCE TASK UNDER SIX
NOISE INTENSITY LEVELS

(Hypothetical example taken from Hays (1963, p. 548))

        I                   II                  III
     (10, 11)          (27, 28, 29)       (36, 37, 38, 39)
     (14, 15)               32                  42
     (18, 19)        (36, 37, 38, 39)       (46, 47, 48)
     (20, 21)            (40, 41)             (49, 50)
        24                  43                  51
     (25, 26)            (44, 45)             (52, 53)
     (30, 31)            (49, 50)               56
     (34, 35)            (52, 53)               57
 (36, 37, 38, 39)        (54, 55)             (58, 59)
   (46, 47, 48)          (58, 59)               60

        IV                   V                  VI
     (18, 19)             (5, 6)                 1
     (22, 23)                8                   2
   (27, 28, 29)          (10, 11)                3
     (30, 31)            (12, 13)                4
        33                  16                (5, 6)
     (34, 35)               17                   7
     (40, 41)            (20, 21)                9
     (44, 45)            (25, 26)            (12, 13)
   (46, 47, 48)        (27, 28, 29)          (14, 15)
     (54, 55)        (36, 37, 38, 39)        (22, 23)


Both the Terry-Hoeffding and the Van der Waerden forms of the test
statistic require a knowledge of the ranks of the observations. The
Terry-Hoeffding form uses tabled values of the expected normal order statistics in
order to replace each rank by the corresponding expected normal order statistic
(Table 2). Extensive tables of $E[x^{(r_{ij})}]$ and $\sum_i \sum_j (E[x^{(r_{ij})}])^2$ are available (Harter,
1961; Owen, 1962). The Van der Waerden form does not require special tables
since the user may compute the inverse normal statistic from any standard normal
table by finding the value of $\Phi^{-1}[r_{ij}/(N+1)]$ which has a cumulative
probability of occurrence of $r_{ij}/(N+1)$ (Table 3). Such a procedure ordinarily
involves linear interpolation so a Probit table (Fisher & Yates, 1938) may be
preferred to linear interpolation in a standard normal table.


TABLE 2. EXPECTED NORMAL ORDER STATISTICS

    I         II        III       IV        V         VI
 -0.9634   -0.1043    0.2960   -0.5225   -1.3764   -2.3193
 -0.7252    0.0625    0.4986   -0.3395   -1.1444   -1.9352
 -0.5225    0.2960    0.7530   -0.1043   -0.9634   -1.7162
 -0.4292    0.4292    0.8991    0.0000   -0.8383   -1.5574
 -0.2740    0.5464    0.9966    0.1042   -0.6459   -1.3764
 -0.2097    0.6206    1.1060    0.1673   -0.5954   -1.2287
  0.0000    0.8991    1.4302    0.4292   -0.4292   -1.0676
  0.1673    1.1060    1.5574    0.6206   -0.2097   -0.8383
  0.2960    1.2757    1.8257    0.7530   -0.1043   -0.7252
  0.7530    1.8257    2.3193    1.2757    0.2960   -0.3395

Sums of squares for expected normal order statistics

$$\sum_i \Big(\sum_j E[x^{(r_{ij})}]\Big)^2 = 402.0283, \qquad \sum_i \sum_j \big(E[x^{(r_{ij})}]\big)^2 = 57.3048$$

$$W = \frac{(N-1) \sum_i \big(\sum_j E[x^{(r_{ij})}]\big)^2}{n \sum_i \sum_j \big(E[x^{(r_{ij})}]\big)^2} = \frac{59\,(402.0283)}{10\,(57.3048)} = 41.39$$


TABLE 3. INVERSE NORMAL STATISTICS

    I         II        III       IV        V         VI
 -0.9468   -0.1030    0.2920   -0.5159   -1.3424   -2.1444
 -0.7130    0.0627    0.4930   -0.3346   -1.1217   -1.8384
 -0.5159    0.2920    0.7407   -0.1030   -0.9468   -1.6546
 -0.4235    0.4235    0.8839    0.0000   -0.8242   -1.5063
 -0.2715    0.5388    0.9782    0.1030   -0.6372   -1.3424
 -0.2070    0.6115    1.0834    0.1650   -0.5858   -1.2004
  0.0000    0.8839    1.3917    0.4235   -0.4235   -1.0450
  0.1650    1.0834    1.5063    0.6115   -0.2070   -0.8242
  0.2920    1.2467    1.7465    0.7407   -0.1030   -0.7130
  0.7407    1.7465    2.1444    1.2467    0.2920   -0.3346

Sums of squares for inverse normal statistics

$$\sum_i \Big(\sum_j \Phi^{-1}\Big[\frac{r_{ij}}{N+1}\Big]\Big)^2 = 375.4826, \qquad \sum_i \sum_j \Big(\Phi^{-1}\Big[\frac{r_{ij}}{N+1}\Big]\Big)^2 = 53.1797$$

$$W = \frac{(N-1) \sum_i \big(\sum_j \Phi^{-1}[r_{ij}/(N+1)]\big)^2}{n \sum_i \sum_j \big(\Phi^{-1}[r_{ij}/(N+1)]\big)^2} = \frac{59\,(375.4826)}{10\,(53.1797)} = 41.66$$
The statement of the null hypothesis in terms of the distribution functions
implies that the samples have been drawn from continuous distributions. This
assumption precludes the occurrence of tied scores; however, in the example
there are many pairs of observations tied for the same rank. Two procedures for
dealing with ties are common in non-parametric statistics. Ties may be broken
at random. If the observations $x_{ij}$ and $x_{kl}$ are tied for the same ranks, say
(10, 11), random breaking of ties implies that Pr(rank $x_{ij}$ = 10) = Pr(rank $x_{kl}$ = 10)
and Pr(rank $x_{ij}$ = 11) = Pr(rank $x_{kl}$ = 11). This technique of breaking ties at
random may be extended to multiple observations having the same ranks. The
technique has the advantage of retaining the original integer values of the ranks
so that tabled values of $\sum_i \sum_j (E[x^{(r_{ij})}])^2$ may be used for the test statistic, and
the test statistic will be distributed asymptotically as $\chi^2$ with $c-1$ degrees of
freedom. On the other hand, the procedure introduces extraneous randomization.
Not only may this randomization affect the computed values of the test
statistic, but it may also cause two experimenters analysing the same data to
arrive at different conclusions.

Instead of breaking ties at random, the experimenter may wish to compute
an average statistic for each tied score (Van der Waerden and Nievergelt, 1956).
For example, either

$$\tfrac{1}{2}\big\{E[x^{(10)}] + E[x^{(11)}]\big\} \quad \text{or} \quad \tfrac{1}{2}\Big[\Phi^{-1}\Big(\frac{10}{61}\Big) + \Phi^{-1}\Big(\frac{11}{61}\Big)\Big]$$

would be associated with each observation having tied ranks (10, 11). This
'mid-score' procedure extends to multiple observations tied for the same ranks
and was used to obtain the values given in Tables 2 and 3. The use of
mid-scores, however, requires a separate computation of $\sum_i \sum_j (E[x^{(r_{ij})}])^2$, since the
tabled values do not correct for the presence of ties. Though the 'mid-score'
procedure is more tedious than the random breaking of ties, it does not employ
randomization which is extraneous to the experiment.
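
The mid-score idea for the Van der Waerden form can be sketched in a few lines. This is only an illustration: the function names are hypothetical, and the standard library's `NormalDist().inv_cdf` stands in for the table look-up and interpolation described in the text, so the results agree with Table 3 only to about two or three decimals.

```python
from statistics import NormalDist

def van_der_waerden_score(rank, total):
    """Standard normal deviate whose cumulative probability is rank / (total + 1)."""
    return NormalDist().inv_cdf(rank / (total + 1))

def mid_score(tied_ranks, total):
    """Average the inverse normal scores over a block of tied ranks."""
    scores = [van_der_waerden_score(r, total) for r in tied_ranks]
    return sum(scores) / len(scores)

N = 60  # six noise levels with ten subjects each, as in the example

# Every observation sharing ranks (10, 11) receives the same mid-score.
print(round(mid_score((10, 11), N), 4))
# A pair straddling the middle of the ranking averages out to (almost) zero.
print(round(mid_score((30, 31), N), 4))
```

An untied observation is simply a block of length one, so the same function covers both cases.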

The hypothesis of stochastic equality of the populations of scores for the
six noise intensity levels is tested against a location alternative by both the
Terry-Hoeffding and Van der Waerden forms of the normal scores test.
'Mid-scores' are used for both forms of the test. For the Terry-Hoeffding form the
test statistic, W, equals 41.39 (Table 2), while the Van der Waerden form yields
W = 41.66 (Table 3). If the test of stochastic equality is performed at $\alpha$ = 0.05,
the hypothesis of equality is rejected in favour of a location alternative since W
exceeds $\chi^2_5$ = 11.071.
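
As a check on the arithmetic, the two values of W can be recomputed from the sums reported under Tables 2 and 3. The sketch assumes equal sample sizes and the form W = (N - 1) x (sum of squared group totals) / (n x sum of squared scores) that those table footnotes imply; the function name is hypothetical.

```python
def normal_scores_w(total_n, group_n, sum_sq_group_totals, sum_sq_scores):
    """c-sample normal scores statistic for equal group sizes."""
    return (total_n - 1) * sum_sq_group_totals / (group_n * sum_sq_scores)

CHI2_5DF_05 = 11.071  # upper 5 per cent point of chi-square with 5 d.f., as quoted in the text

w_terry_hoeffding = normal_scores_w(60, 10, 402.0283, 57.3048)
w_van_der_waerden = normal_scores_w(60, 10, 375.4826, 53.1797)

for name, w in (("Terry-Hoeffding", w_terry_hoeffding),
                ("Van der Waerden", w_van_der_waerden)):
    print(name, round(w, 2), "reject" if w > CHI2_5DF_05 else "do not reject")
```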

It can be shown (Table 4) that the trend of the normal scores for the noise
intensity levels has both a linear and a quadratic component, but not a cubic
component. These results are consistent with the parametric analysis of average
trends (Hays, 1963) and the non-parametric analysis of trends in rank means
(Marascuilo & McSweeney, 1967); however, post hoc analyses in terms of normal
scores lack the easy interpretability of analyses in terms of means.


TABLE 4. SUMMARY STATISTICS FOR ORTHOGONAL COMPARISONS WITH MEAN
NORMAL SCORES BY NOISE INTENSITY LEVEL

Noise intensity level                     I         II        III       IV        V         VI
Sample size                              10        10        10        10        10        10
Mean expected normal order statistic  -0.19077   0.69560   1.16819   0.23837  -0.60110  -1.31038
Mean inverse normal statistic         -0.18800   0.67860   1.12601   0.23369  -0.58996  -1.26033

Orthogonal polynomials
Linear trend (ψ_L)                       -5        -3        -1         1         3         5
Quadratic trend (ψ_Q)                     5        -1        -4        -4        -1         5
Cubic trend (ψ_C)                        -5         7         4        -4        -7         5

Confidence intervals for mean expected normal order statistics and mean inverse normal
statistics using the mid-score correction for ties

Expected normal order            Inverse normal
-19.09 < ψ_L <  -1.74            -18.42 < ψ_L <  -1.70
-22.73 < ψ_Q <  -3.72            -21.92 < ψ_Q <  -3.61
 -6.71 < ψ_C <  21.11             -6.31 < ψ_C <  20.49
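
The intervals for the expected normal order statistics can be reproduced from the group means and the sum of squared scores. The sketch below rests on two assumptions that are not spelled out at this point in the text: the contrast coefficients are the usual orthogonal polynomial coefficients for six equally spaced levels, and the variance of a contrast is (sum of squared scores / (N - 1)) x (sum of squared coefficients / n). All names are illustrative.

```python
import math

# Mean expected normal order statistics by noise level (Table 4) and the
# sum of squared scores reported with Table 2.
MEANS_TH = [-0.19077, 0.69560, 1.16819, 0.23837, -0.60110, -1.31038]
SUM_SQ_TH = 57.3048
N_TOTAL, N_PER_GROUP = 60, 10
CHI2_CRIT = 11.071  # chi-square, 5 d.f., upper 5 per cent point

# Assumed orthogonal polynomial coefficients for six equally spaced levels.
CONTRASTS = {
    "linear": [-5, -3, -1, 1, 3, 5],
    "quadratic": [5, -1, -4, -4, -1, 5],
    "cubic": [-5, 7, 4, -4, -7, 5],
}

def contrast_interval(coeffs, means, sum_sq, n_total, n_group, chi2_crit):
    """Simultaneous confidence interval for one contrast in the mean scores."""
    estimate = sum(a * m for a, m in zip(coeffs, means))
    variance = sum_sq / (n_total - 1) * sum(a * a for a in coeffs) / n_group
    half_width = math.sqrt(chi2_crit * variance)
    return estimate - half_width, estimate + half_width

for name, coeffs in CONTRASTS.items():
    low, high = contrast_interval(coeffs, MEANS_TH, SUM_SQ_TH,
                                  N_TOTAL, N_PER_GROUP, CHI2_CRIT)
    print(f"{name:9s} {low:7.2f} {high:7.2f}")
```

An interval that excludes zero marks a trend component that differs significantly from zero; only the cubic interval covers zero here.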


4. COMPARISON OF THE KRUSKAL-WALLIS AND NORMAL SCORES TESTS


4.1. Introduction


[...] disadvantage of being more tedious to [...] Kruskal-Wallis statistic.
Moreover, [...] statistics also increases the computation [...]
by the use of a c-sample normal scores test.

Despite these disadvantages, the normal scores test still merits consideration
as a competitor with the Kruskal-Wallis test when the analysis of variance
assumptions have been violated or when the data are ordinal. The normal scores
test is asymptotically at least as efficient as the F-test and more efficient than the
Kruskal-Wallis test when the samples have been drawn from some commonly [...]
normal, uniform, and exponential distributions (Puri, 1964).


4.2. Validity

Although the asymptotic properties of the c-sample normal scores test can
be derived by considering this test as a special case [...]
statistic, little is known of the behaviour of the [...]
situations involving small sample sizes and three [...]
Theoretical and empirical studies of the two-sample normal scores test give
some indication of the relationship to be expected between the asymptotic
properties of the c-sample test and its corresponding small-sample characteristics.
The quality of the asymptotic approximation to the null distribution of the
normal scores test statistic was studied by Klotz (1964) for the case of two
samples of equal size (n = 5(1)10). Although the normal approximation to
the distribution of the normal scores test was generally good, it was subject to
large percentage errors in the tails of the distribution and was only slightly
better than the normal approximation to the Wilcoxon test statistic.

An empirical study of the validity of the Kruskal-Wallis and c-sample normal
scores test for three samples of equal size (n = 5, 6, 8, 10, and 12) from normal
and uniform distributions (Table 5) did not detect pronounced differences in
validity between the Terry-Hoeffding and Van der Waerden forms of the normal
scores test or between the normal scores and rank tests (McSweeney, 1967).
The Kruskal-Wallis and normal scores tests were both slightly conservative for
small sample sizes. Moreover, in those cases in which the Kruskal-Wallis and
normal scores tests did differ in the size of the discrepancy between nominal $\alpha$
and estimated actual $\alpha$, the normal scores test was usually the more conservative
of the two tests.


TABLE 5. ESTIMATED ACTUAL α FOR THE KRUSKAL-WALLIS (H), TERRY-
HOEFFDING (W_TH) AND VAN DER WAERDEN (W_VW) STATISTICS BASED ON
SAMPLES OF SIZE n FROM STANDARD NORMAL AND UNIFORM POPULATIONS

Nominal     α = 0.100              α = 0.050              α = 0.025              α = 0.010
  n      H     W_TH   W_VW      H     W_TH   W_VW      H     W_TH   W_VW      H     W_TH   W_VW

Samples from the standard normal distribution
  5    0.095  0.100  0.099    0.050  0.042  0.044    0.016  0.013  0.014    0.003  0.001  0.002
  6    0.084  0.082  0.084    0.037  0.034  0.036    0.015  0.014  0.014    0.004  0.002  0.004
  8    0.095  0.095  0.094    0.045  0.042  0.041    0.017  0.019  0.019    0.006  0.005  0.005
 10    0.100  0.106  0.106    0.046  0.042  0.043    0.027  0.026  0.026    0.010  0.010  0.010
 12    0.107  0.100  0.102    0.047  0.045  0.045    0.023  0.019  0.021    0.009  0.007  0.008

Samples from uniform distribution on the unit interval
  5    0.090  0.093  0.094    0.044  0.036  0.036    0.015  0.014  0.014    0.001  0.000  0.000
  6    0.102  0.101  0.100    0.056  0.046  0.049    0.026  0.020  0.021    0.001  0.000  0.000
  8    0.101  0.091  0.093    0.037  0.041  0.040    0.015  0.014  0.015    0.006  0.003  0.003
 10    0.107  0.100  0.103    0.056  0.050  0.053    0.025  0.019  0.020    0.005  0.006  0.006
 12    0.105  0.11   0.108    0.052  0.051  0.051    0.019  0.020  0.018    0.008  0.003  0.003

Kruskal and Wallis' conjecture that statistics based on normalized observations
would approach the asymptotic distribution more rapidly than would the
statistics based on ranks is not substantiated by these results. The sampling
distributions of the Kruskal-Wallis test and the two forms of the normal scores
test appear to be equally well approximated by the $\chi^2$ distribution with $c-1$
degrees of freedom. The normal scores test gains little advantage over the
Kruskal-Wallis test from the initial normality of the expected normal order
statistics and the inverse normal statistics, if the vector of mean ranks is already
approximately multivariate normal for samples of only five observations per
group. The goodness of fit of the $\chi^2$ distribution is not impaired by using small
sample sizes or by sampling from uniform rather than normal populations.
Although an increase in the number of samples (c > 3) is unlikely to change these
results, sampling from asymmetric distributions such as the exponential might
show the normal scores test to greater relative advantage when compared with
the Kruskal-Wallis test.


4.4 Exact and Empirical (Monte Carlo) Power 


Studies of the exact power of the normal scores test have generally confirmed 
the asymptotic results; however, the comparisons of the power of the normal 
scores test and the rank test depend on the sample size and value of the location 
parameter as well as the type of distribution sampled. Klotz (1963) found in the 
matched-pairs normal shift case that the normal scores test was generally more 
powerful than the Wilcoxon test for small values of the location parameter. As 
the sample size or the location parameter increased, the power of the Wilcoxon 
and normal scores tests became very similar, with the Wilcoxon test sometimes 
the more powerful of the two for large shifts in the region of interest. Van der 
Laan (1964) found essentially the same results for the two-sample case when the 
samples were drawn from exponential distributions. However, in the case of 


the uniform distribution, the normal scores test was clearly more powerful than 
the Wilcoxon test for all values of the location parameter. 


The findings of the exact power studies have been confirmed in the Monte 
Carlo studies of the two-sample normal scores test (Thompson, 1966; Van der 
Laan & Oosterhoff, 1965, 1967). Van der Laan & Oosterhoff point out that 
although the asymptotic efficiency of the Wilcoxon test relative to the normal 
scores test is 0.955 against a normal shift alternative, much of this advantage of
the normal scores test is lost when small sample sizes are used. 

A Monte Carlo study of the c-sample normal scores test and the Kruskal-
Wallis test (McSweeney, 1967) indicated that the normal scores test has greater
empirical power than the Kruskal-Wallis test when samples are drawn from
uniformly distributed populations and the significance levels are moderately large
($\alpha \geq 0.05$, $n \geq 8$ observations per treatment; Table 6). The empirical sampling
distributions of the Kruskal-Wallis and c-sample normal scores tests were
obtained on the basis of 1,000 samples of 3n pseudo-random numbers generated
by a modification of the Lehmer multiplicative-congruential method. Pseudo-
uniformly distributed values were obtained directly; a transformation of uniform
variates was used to obtain pseudo-normally distributed values. The constants
0, $-\delta$, $+\delta$ were added to the samples to generate samples from populations
with means $\mu$, $\mu - \delta$, and $\mu + \delta$ respectively. The empirical sampling
distributions were used to estimate the actual $\alpha$ and small sample power of the tests.
The estimates within any single row (Tables 6 and 7) are based on the same 1,000
samples of 3n observations and the same value of the location parameter, $\delta$; thus
the values reported across a given row are strongly dependent. The values
within a given column and across the two tables do not exhibit this dependence


TABLE 6. MONTE CARLO POWER FOR THE KRUSKAL-WALLIS (H), TERRY-
HOEFFDING (W_TH) AND VAN DER WAERDEN (W_VW) STATISTICS BASED ON
SAMPLES FROM UNIFORM DISTRIBUTIONS HAVING MEANS 0.5, 0.5 − δ, AND
0.5 + δ RESPECTIVELY

                    α = 0.050              α = 0.025              α = 0.010
  n      δ       H     W_TH   W_VW      H     W_TH   W_VW      H     W_TH   W_VW

  5     0.10   0.100  0.093  0.095    0.043  0.037  0.039    0.011  0.004  0.005
        0.15   0.160  0.154  0.156    0.080  0.069  0.071    0.025  0.013  0.015
        0.20   0.263  0.261  0.262    0.114  0.105  0.108    0.042  0.022  0.026
        0.25   0.432  0.432  0.431    0.253  0.237  0.240    0.087  0.050  0.067
        0.30   0.639  0.628  0.631    0.414  0.402  0.408    0.163  0.093  0.106
        0.50   0.996  0.995  0.955    0.966  0.951  0.953    0.738  0.605  0.629
        0.80   1.000  1.000  1.000    1.000  1.000  1.000    0.999  0.997  0.998

  6     0.10   0.130  0.135  0.133    0.050  0.046  0.047    0.015  0.013  0.014
        0.15   0.192  0.209  0.204    0.103  0.099  0.099    0.033  0.026  0.027
        0.20   0.380  0.402  0.401    0.237  0.241  0.241    0.083  0.075  0.078
        0.25   0.557  0.582  0.582    0.372  0.376  0.380    0.176  0.156  0.161
        0.30   0.759  0.781  0.777    0.570  0.582  0.583    0.314  0.291  0.301
        0.50   1.000  1.000  1.000    0.995  0.992  0.992    0.962  0.921  0.930
        0.80   1.000  1.000  1.000    1.000  1.000  1.000    1.000  1.000  1.000

  8     0.10   0.154  0.172  0.174    0.084  0.093  0.091    0.043  0.041  0.042
        0.15   0.339  0.379  0.371    0.222  0.237  0.237    0.105  0.110  0.109
        0.20   0.529  0.587  0.578    0.387  0.416  0.40     0.208  0.224  0.223
        0.25   0.758  0.810  0.801    0.628  0.659  0.656    0.401  0.422  0.424
        0.30   0.905  0.932  0.929    0.828  0.848  0.850    0.650  0.669  0.671
        0.50   1.000  1.000  1.000    1.000  1.000  1.000    1.000  0.999  0.999
        0.80   1.000  1.000  1.000    1.000  1.000  1.000    1.000  1.000  1.000

 10     0.10   0.199  0.223  0.218    0.114  0.140  0.135    0.043  0.052  0.052
        0.15   0.399  0.481  0.472    0.273  0.319  0.319    0.153  0.168  0.169
        0.20   0.655  0.718  0.711    0.535  0.591  0.591    0.251  0.399  0.397
        0.25   0.863  0.909  0.905    0.770  0.817  0.814    0.602  0.650  0.646
        0.30   0.961  0.975  0.974    0.903  0.939  0.934    0.813  0.842  0.844
        0.50   1.000  1.000  1.000    1.000  1.000  1.000    1.000  1.000  1.000
        0.80   1.000  1.000  1.000    1.000  1.000  1.000    1.000  1.000  1.000

 12     0.10   0.218  0.272  0.266    0.130  0.151  0.149    0.057  0.070  0.070
        0.15   0.519  0.581  0.577    0.375  0.453  0.445    0.237  0.281  0.277
        0.20   0.768  0.825  0.816    0.639  0.726  0.721    0.455  0.534  0.525
        0.25   0.910  0.954  0.949    0.851  0.893  0.887    0.745  0.800  0.794
        0.30   0.987  0.996  0.995    0.973  0.985  0.984    0.940  0.960  0.959
        0.50   1.000  1.000  1.000    1.000  1.000  1.000    1.000  1.000  1.000
        0.80   1.000  1.000  1.000    1.000  1.000  1.000    1.000  1.000  1.000

TABLE 7. MONTE CARLO POWER FOR THE KRUSKAL-WALLIS (H), TERRY-
HOEFFDING (W_TH) AND VAN DER WAERDEN (W_VW) STATISTICS BASED ON
SAMPLES FROM NORMAL DISTRIBUTIONS HAVING MEANS 0, 0 − δ, AND
0 + δ RESPECTIVELY

            α = 0.050              α = 0.025              α = 0.010
  n      H     W_TH   W_VW      H     W_TH   W_VW      H     W_TH   W_VW

  5    0.048  0.035  0.038    0.011  0.007  0.009    0.002  0.000  0.000
       0.061  0.056  0.056    0.026  0.023  0.023    0.006  0.004  0.005
       0.081  0.076  0.076    0.029  0.025  0.026    0.011  0.005  0.006
       0.194  0.174  0.177    0.094  0.084  0.085    0.026  0.012  0.013
       0.435  0.411  0.414    0.276  0.246  0.253    0.102  0.055  0.065
       0.648  0.620  0.622    0.451  0.425  0.432    0.195  0.120  0.135
       0.829  0.803  0.808    0.657  0.625  0.632    0.354  0.237  0.266
       0.947  0.936  0.942    0.845  0.823  0.827    0.600  0.465  0.500
       0.989  0.986  0.986    0.959  0.953  0.955    0.819  0.723  0.756
       0.996  0.995  0.995    0.990  0.987  0.988    0.887  0.803  0.823

  6    0.059  0.054  0.055    0.027  0.022  0.023    0.01   0.008  0.011
       0.058  0.058  0.057    0.026  0.024  0.025    0.006  0.005  0.005
       0.105  0.098  0.101    0.062  0.059  0.062    0.019  0.015  0.017
       0.231  0.221  0.230    0.132  0.126  0.129    0.053  0.043  0.049
       0.555  0.542  0.547    0.389  0.365  0.368    0.187  0.153  0.162
       0.755  0.752  0.754    0.620  0.595  0.603    0.369  0.330  0.348
       0.887  0.877  0.884    0.784  0.772  0.75     0.564  0.513  0.531
       0.984  0.981  0.981    0.944  0.944  0.945    0.817  0.792  0.801

  8    0.053  0.055  0.056    0.028  0.030  0.031    0.009  0.007  0.008
       0.080  0.075  0.077    0.037  0.034  0.033    0.009  0.008  0.008
       0.137  0.133  0.135    0.079  0.077  0.076    0.027  0.020  0.022
       0.318  0.323  0.323    0.212  0.210  0.213    0.114  0.101  0.107
       0.724  0.720  0.724    0.589  0.591  0.593    0.415  0.391  0.398
       0.889  0.891  0.894    0.809  0.799  0.805    0.642  0.610  0.615
       0.974  0.968  0.969    0.934  0.936  0.939    0.842  0.815  0.819
       0.997  0.998  0.998    0.995  0.995  0.995    0.971  0.964  0.968

 10    0.057  0.063  0.063    0.025  0.024  0.024    0.008  0.008  0.008
       0.093  0.093  0.093    0.047  0.048  0.049    0.017  0.018  0.018
       0.174  0.175  0.177    0.107  0.097  0.097    0.056  0.058  0.059
       0.412  0.411  0.405    0.284  0.284  0.288    0.161  0.149  0.154
       0.832  0.852  0.854    0.733  0.732  0.732    0.577  0.562  0.569
       0.955  0.960  0.950    0.919  0.926  0.926    0.831  0.833  0.834
       0.996  0.996  0.997    0.989  0.987  0.988    0.920  0.972  0.975

 12    0.061  0.051  0.052    0.029  0.033  0.03     0.012  0.012  0.012
       0.127  0.126  0.126    0.073  0.071  0.079    0.030  0.025  0.026
       0.208  0.195  0.198    0.120  0.117  0.118    0.056  0.057  0.058
       0.500  0.498  0.503    0.373  0.360  0.368    0.219  0.210  0.213
       0.915  0.922  0.923    0.832  0.844  0.841    0.699  0.698  0.702
       0.988  0.986  0.987    0.968  0.972  0.974    0.928  0.935  0.936
       1.000  1.000  1.000    0.999  0.998  0.998    0.991  0.988  0.989

because they have been derived from independent sets of 1,000 samples each.
For samples of at least ten observations per treatment, the normal scores test is
consistently more powerful than the Kruskal-Wallis test for all commonly used
significance levels ($\alpha \geq 0.01$). Nevertheless, the differences between the
estimated power functions of the two tests are not appreciable: typically the
differences in empirical power range from 0.01 to 0.10. If the samples have
been drawn from normal rather than uniform populations (Table 7) the normal
scores test is negligibly different from or slightly less powerful than the Kruskal-
Wallis test for large values of the empirical power ($\beta$ > 0.8), irrespective of the
sample sizes.
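
The design of such a sampling study can be sketched for the Kruskal-Wallis statistic alone. This is a simplified illustration, not the procedure of the cited study: Python's built-in generator replaces the Lehmer multiplicative-congruential method, only the uniform case is shown, and the critical value uses the closed form for chi-square with 2 degrees of freedom (three samples). Function names are hypothetical.

```python
import math
import random

def kruskal_wallis_h(samples):
    """Kruskal-Wallis H for samples without ties (continuous data)."""
    pooled = sorted((x, g) for g, s in enumerate(samples) for x in s)
    rank_sums = [0.0] * len(samples)
    for rank, (_, g) in enumerate(pooled, start=1):
        rank_sums[g] += rank
    n_total = len(pooled)
    weighted = sum(r * r / len(s) for r, s in zip(rank_sums, samples))
    return 12.0 / (n_total * (n_total + 1)) * weighted - 3 * (n_total + 1)

def empirical_rejection_rate(n, delta, alpha, reps, rng):
    """Share of `reps` replications in which H exceeds the chi-square point."""
    critical = -2.0 * math.log(alpha)  # P[chi2(2 d.f.) > x] = exp(-x / 2)
    shifts = (0.0, -delta, delta)      # population means 0.5, 0.5 - delta, 0.5 + delta
    rejections = 0
    for _ in range(reps):
        samples = [[rng.random() + s for _ in range(n)] for s in shifts]
        if kruskal_wallis_h(samples) > critical:
            rejections += 1
    return rejections / reps

rng = random.Random(0)
print("estimated actual alpha:", empirical_rejection_rate(5, 0.0, 0.05, 1000, rng))
print("estimated power:       ", empirical_rejection_rate(5, 0.5, 0.05, 1000, rng))
```

Because each call draws its own 1,000 replications, estimates from separate calls are independent, whereas statistics computed on the same replications would be strongly dependent, as the text notes.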
Robustness 


The normal scores test might also be preferred to the Kruskal-Wallis 
test when moderate size samples are drawn from populations having unequal 
variances. Pratt (1964) demonstrated that both the parametric and non- 
parametric two-sample test of location were affected by differences in scale in the 
underlying populations; however, the normal scores procedures were far less 
sensitive to inhomogeneity of variance than were the parametric or rank pro- 
cedures. Ury’s term ‘assumption freer statistics’ (1967, p. 53) may be used to 
characterize the normal scores test in relation to the Kruskal-Wallis test. Even 
in those situations in which the power comparison of the two tests does not clearly 
favour the normal scores test, the normal scores test may be the more appro- 
priate of the two because of its greater robustness with respect to non-normality 


and inhomogeneity of variance. 
5. CONCLUSIONS 

The asymptotic properties of the normal scores test imply that the c-sample
normal scores test will be more powerful than the Kruskal-Wallis test for
samples from normal, uniform and exponential distributions. The small
sample results cited indicate that the comparison is dependent on the significance
level of the test, the location parameter, and the sample sizes as well as on the
distributions sampled. The small sample power of the normal scores test is
clearly superior to that of the Kruskal-Wallis test in those marginal cases in
which a test at a moderate significance level is used to detect small differences
in location among non-normal distributions ($\alpha \geq 0.05$ and $0.3 < \beta < 0.7$). The
greater insensitivity of the normal scores test to differences in scale among the
distributions may make it a more desirable test of location than the less robust
Kruskal-Wallis test. Should exact tables be developed for the c-sample normal
scores test, it will become an even stronger competitor with the Kruskal-Wallis
test in the small sample case.
MULTIPLE COMPARISONS IN PSYCHOLOGICAL EXPERIMENTS


By M. A. AITKIN 


Macquarie University 


A simple relation between error rates using multiple t-tests and Scheffé's
multiple-comparison procedure is pointed out. Errors in the development of
some new multiple-comparison procedures are indicated. An experimental error
rate approach using a flexible error rate is advocated.


1. INTRODUCTION 


Several computer-sampling experiments have recently been carried out to 
examine the relative performances of various multiple-comparison procedures in 
analysis of variance (see, for example, Balaam, 1963; Petrinovich & Hardyck, 
1969). In particular, comparisons of multiple or ‘ unprotected’ t-tests with 
Scheffé’s S method (Scheffé, 1959) have shown that, experimentally, the former 
gives far more Type I errors than the latter. Sampling experiments are not 
required to demonstrate this, however, as exact probabilities may be calculated 
from Pearson's (1968) tables of the incomplete beta-function, as demonstrated 
by Gabriel (1964). 

Rodger (1965, 1967) has recently proposed a decision approach to multiple 
comparisons. He amends the Scheffé S and Tukey T methods, which are 
based on an experimental error rate, in order to obtain a decision error rate, 
which he feels is more appropriate in psychological experiments. It is shown 
in this paper that the amended Scheffé procedure does not have the decision error 
rate claimed for it and the same is clearly true for the Tukey procedure. An
error in the derivation of the amended procedure is pointed out.

It is suggested that the ‘lack of sensitivity’ objected to in experimental 
error rate procedures is brought about by the use of conventional 5 or 1 per cent 
significance levels. In many experimental situations when the null hypothesis is 
known a priori to be false, it is appropriate to increase substantially the experi- 
mental error rate above these conventional levels. 


2. ALL CONTRASTS 


Consider for simplicity a one-way classification with k normal populations
with means $\mu_j$ (j = 1, ..., k) and common variance $\sigma^2$. Random samples of size n
are available from each population, the sample means being $\bar{y}_j$. Let $s^2$ be the
within-sample mean square calculated in the usual way, based on $\nu = k(n-1)$
degrees of freedom.

A contrast of the means $\mu_j$ is defined by $\sum_{j=1}^{k} C_j \mu_j$, where the $C_j$ are
constants such that $\sum_{j=1}^{k} C_j = 0$. A comparison of the means is a contrast with
$C_i = 1$, $C_j = -1$, $C_l = 0$, $l \neq i$ or $j$. To test the null hypothesis
$H_i$: $\sum_{j=1}^{k} C_{ij}\mu_j = 0$ that a single contrast of the means is equal to zero against a
general alternative, the test statistic is

$$t_i = \frac{n^{1/2} \sum_{j=1}^{k} C_{ij}\bar{y}_j}{s \big(\sum_{j=1}^{k} C_{ij}^2\big)^{1/2}}.$$

Under the null hypothesis, $t_i$ has a $t_\nu$ distribution. A level $\alpha$ test is then:
reject the contrast hypothesis if

$$|t_i| > t_\nu^{\alpha},$$

where $t_\nu^{\alpha}$ is the upper 100$\alpha$ per cent point of the $t_\nu$ distribution.

Rodger (1967) has suggested that all contrasts among the k means (not
merely all comparisons of pairs) should be tested in exactly the same way by an
$\alpha$-level $t_\nu$ test, in that the experimenter's interest is in the individual contrasts
of means, not in an overall test that all contrasts are zero, for this is generally
known a priori to be false. The use of $\alpha$-level t-tests for all contrasts is generally
associated with Fisher, and called either the LSD test if a preliminary F-test is
significant at level $\alpha$, or the multiple t-test if no preliminary F-test is used
(Duncan, 1955). These tests were usually used only for comparisons.

It is well known that the use of multiple t-tests on all comparisons (or more
generally all contrasts) will result in a much greater probability than $\alpha$ of
incorrectly rejecting at least one of the contrast hypotheses. Scheffé's S method
was developed to make this probability (the experimental error rate) exactly $\gamma$,
for any specified $\gamma$. To achieve an experimental error rate of $\gamma$, any contrast
hypothesis $H_i$ is rejected if

$$|t_i| > \big[(k-1) F^{\gamma}_{k-1,\nu}\big]^{1/2}.$$

Sampling experiments have been carried out to estimate $\gamma$ given $\alpha$, and
vice versa (among other aims). However, the relation between $\gamma$ and $\alpha$ can be
obtained directly from tables of the incomplete beta-function as demonstrated
by Gabriel (1964). Consider two examples from the sampling study by
Petrinovich & Hardyck (1969). They found, for k = 3, n = 5, $\nu$ = 12, $\alpha$ = 0.05,
that a 5 per cent level t-test on each comparison (only the three comparisons of
means were considered) resulted in an experimental error rate for the t-test
procedure of about $\gamma$ = 0.075 (reading from their Fig. 2). The empirical size of
the t-tests (averaged over the three tests) was, however, only about 0.032 instead
of the nominal 0.05. On the other hand, a Scheffé 5 per cent experimental
procedure (i.e. $\gamma$ = 0.05) had a 'comparison' error rate of $\alpha$ = 0.013 approximately
(from their Fig. 1), i.e. the average size of the three separate t-tests was 0.013.
These results are only sampling approximations; the exact probabilities may be
obtained as follows.
For $\alpha$ = 0.05, the critical value is $t_{12}^{0.05}$ = 2.18. If all contrasts, or a subset of
them, are tested against this value, from the Scheffé point of view this means that

$$(2F^{\gamma}_{2,12})^{1/2} = 2.18$$

or $F^{\gamma}_{2,12}$ = 2.376. The corresponding value of $\gamma$ is not tabulated in the usual
F-tables, but $\gamma$ may be obtained from Pearson's tables (1968) by making the
substitution

$$x = (1 + \nu_1 F/\nu_2)^{-1}$$

when

$$\gamma = I_x(\tfrac{1}{2}\nu_2, \tfrac{1}{2}\nu_1).$$

In the above example, $\nu_1$ = 2, $\nu_2$ = 12, F = 2.376, x = 0.716, $\gamma$ = 0.135. Thus a
5 per cent level t-test on all contrasts corresponds to a 13.5 per cent level Scheffé
S procedure, and in particular corresponds to an overall 13.5 per cent level F-test
for the equality of the three group means. The agreement of the sampling
estimate is rather poor.

For the second example, if $\gamma$ = 0.05, then $F^{0.05}_{2,12}$ = 3.89. The critical value
for all contrasts is then $(2F^{0.05}_{2,12})^{1/2}$ = 2.789. Thus from the multiple t-test
point of view, $t_{12}^{\alpha}$ = 2.789. The corresponding value of $\alpha$ may be obtained as
above, using

$$x = (1 + t^2/\nu)^{-1}.$$

Thus for $\nu$ = 12, t = 2.789, x = 0.607, $\alpha$ = 0.0164. The sampling estimate of
0.013 is in reasonable agreement.
In general, $\gamma$ and $\alpha$ may be found from the relations

$$\gamma = P\big[F_{k-1,\nu} > F^{\alpha}_{1,\nu}/(k-1)\big]$$

and

$$\alpha = P\big[F_{1,\nu} > (k-1)F^{\gamma}_{k-1,\nu}\big].$$
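
Both worked examples can be checked numerically without the incomplete beta tables. The sketch below is an illustration under stated assumptions: for k = 3 the numerator has 2 degrees of freedom, for which the upper tail of F has the closed form (1 + 2F/ν)^(-ν/2); the tail of F with 1 numerator degree of freedom is the two-sided tail of t, obtained here by Simpson's rule rather than from Pearson's tables. Function names are hypothetical.

```python
import math

def f_tail_2df(f_value, nu):
    """P[F(2, nu) > f_value]; this closed form holds only for 2 numerator d.f."""
    return (1.0 + 2.0 * f_value / nu) ** (-nu / 2.0)

def t_two_sided_tail(t_value, nu, steps=2000):
    """P[|t_nu| > t_value] via Simpson's rule on the t density over [0, t_value]."""
    const = math.gamma((nu + 1) / 2) / (math.sqrt(nu * math.pi) * math.gamma(nu / 2))

    def density(x):
        return const * (1.0 + x * x / nu) ** (-(nu + 1) / 2)

    h = t_value / steps  # steps must be even for Simpson's rule
    total = density(0.0) + density(t_value)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * density(i * h)
    return 1.0 - 2.0 * total * h / 3.0

nu = 12
# First example: alpha = 0.05 t-tests on all contrasts, critical value 2.18.
gamma = f_tail_2df(2.18 ** 2 / 2, nu)
# Second example: Scheffe procedure at gamma = 0.05, F critical value 3.89.
alpha = t_two_sided_tail(math.sqrt(2 * 3.89), nu)

print(round(gamma, 3), round(alpha, 4))
```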

Rodger (1967) has recently proposed the decision error rate approach for
both the orthogonal contrast and all contrast cases. 'Decision' in this context
is essentially the same as 'contrast', as each 'decision' is carried out at a level
$\alpha$ on a contrast of interest. In the orthogonal contrast case, he notes that there
are only $k-1$ orthogonal contrasts, and if $\nu = \infty$, the experimental error rate is
given by $\gamma = 1 - (1-\alpha)^{k-1}$, this result holding approximately for large $\nu$ (for
the standardized contrast statistics have a common denominator and are
statistically independent only if $\nu = \infty$). He also states that the same relation
will hold in the all-contrasts case, so that the testing of all contrasts with an
$\alpha$-level $t_\nu$ test corresponds to testing the overall hypothesis of k equal group
means using $F_{k-1,\nu}$ at the level $\gamma' = 1 - (1-\alpha)^{k-1}$ (presumably for large $\nu$,
though the tables provided in Rodger (1965) go down to $\nu$ = 1).
It is easily seen numerically that this relation does not hold. In the first
example above, with k = 3, $\nu$ = 12, $\alpha$ = 0.05, it was found that $\gamma$ = 0.135. But
$\gamma' = 1 - (1-\alpha)^{k-1} = 0.0975 \neq \gamma$. Thus the overall $\gamma'$ used by Rodger will result
in a decision error rate much less than $\alpha$ = 0.05. In fact, $F^{\gamma'}_{2,12}$ = 2.84 from
Rodger's table (1965, p. 144), so that $t_{12}^{\alpha'} = (2F^{\gamma'}_{2,12})^{1/2}$ = 2.383, whence $\alpha'$ = 0.0346
instead of 0.05. Thus the values of percentage points tabulated in Rodger will
lead to over-conservative tests of the individual contrasts.
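
The figures in this comparison are easy to recompute. The sketch assumes the closed-form upper tail of F with 2 numerator degrees of freedom, P[F > f] = (1 + 2f/ν)^(-ν/2), inverted to give the percentage point; the function names are hypothetical and the table value 2.84 is only reproduced approximately.

```python
def rodger_level(alpha, k):
    """Overall level gamma' = 1 - (1 - alpha)^(k - 1)."""
    return 1.0 - (1.0 - alpha) ** (k - 1)

def f_critical_2df(level, nu):
    """Upper `level` point of F(2, nu), inverting (1 + 2f/nu)^(-nu/2) = level."""
    return nu / 2.0 * (level ** (-2.0 / nu) - 1.0)

gamma_prime = rodger_level(0.05, 3)
f_prime = f_critical_2df(gamma_prime, 12)
t_prime = (2.0 * f_prime) ** 0.5

# gamma' is well below the exact 0.135, so the implied t criterion is
# stricter than the nominal 5 per cent point of 2.18.
print(round(gamma_prime, 4), round(f_prime, 2), round(t_prime, 2))
```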

The statistical error in Rodger's argument arises from a confusion between
the truth of a hypothesis and its acceptance. Since there are only $k-1$
orthogonal contrasts of the k group means, any other contrast must be a linear function
of the orthogonal set, and indeed this is true for any $k-1$ contrasts of full rank.
Thus in the three-group example, one set of two contrasts of full rank is
$K_1 = \mu_1 - \mu_2$, $K_2 = \mu_2 - \mu_3$. Then $K_3 = \mu_1 - \mu_3 = K_1 + K_2$, and similarly for any
other contrasts. Rodger (1967) then argues (p. 58) that if $k-1$ orthogonal
contrasts have been accepted, so must all the others since they are 'perfectly
predictable' from the orthogonal set. The same argument is used, by
implication, for any set of $k-1$ contrasts of full rank. But this is clearly incorrect:
if in the above example $\bar{y}_1 = +\delta$, $\bar{y}_2 = 0$, $\bar{y}_3 = -\delta$, then the sample values of the
contrasts are $k_1 = \delta$, $k_2 = \delta$ and $k_3 = 2\delta$, so that for certain sample sizes $K_3 = 0$
will be rejected while $K_1 = 0$, $K_2 = 0$ are not rejected. Indeed Rodger recognizes
this explicitly (p. 57) when discussing 'intransitive' decisions. Exactly the same
argument is used in establishing a decision procedure using Tukey's Studentized
range, which is therefore also incorrect.

Note added in proof. A referee has pointed out a further error. Rodger
(1965, pp. …), in the derivation of the relation $\gamma = 1 - (1-\alpha)^{k-1}$, says,
essentially, that if $B_1$ and $B_2$ are two non-independent events, then
$P(B_1 \text{ and } B_2) \geq P(B_1)P(B_2)$. This is incorrect, as can be seen from a die-tossing
example in which $B_1$ is the event that the number uppermost is even, and $B_2$
is the event that the number uppermost does not exceed 3. Then
$P(B_1) = P(B_2) = \tfrac{1}{2}$, but $P(B_1 \text{ and } B_2) = \tfrac{1}{6} < P(B_1)P(B_2)$.
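
The die-tossing counter-example can be verified by direct enumeration; the short sketch below uses exact fractions, and the variable names are illustrative.

```python
from fractions import Fraction

faces = range(1, 7)
b1 = {f for f in faces if f % 2 == 0}  # number uppermost is even
b2 = {f for f in faces if f <= 3}      # number uppermost does not exceed 3

def prob(event):
    """Probability of an event on a fair six-sided die."""
    return Fraction(len(event), 6)

p_joint = prob(b1 & b2)
p_product = prob(b1) * prob(b2)

# The joint probability falls below the product, contradicting the claimed inequality.
print(p_joint, p_product, p_joint < p_product)
```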


3. SOME CONTRASTS
It is well known that Scheffé's (or other) all-contrasts procedure is much
less powerful than an orthogonal contrasts procedure, if both have the same
experimental error rate. Using the above three-group example with k = 3,
[...] with $\gamma$ = 0.05 rejects any contrast hypothesis if its
statistic [...] > 3.89. If only two orthogonal contrasts are to be tested, this can
be done with the same experimental error rate using the [...] maximum
modulus (Miller, 1966). From Miller's table (p. …) the [...] of
the maximum modulus for k = 3, $\nu$ = 12 is 2.77. However, orthogonal contrast
procedures may not detect large contrasts present in the data if these are not
members of the orthogonal set, or highly correlated with them.

It would be desirable to have a contrast procedure which allowed the
testing of a specified finite set of $r > k-1$ contrasts at a given level
$\alpha$ (or $\gamma$). From the decision point of view there is no difficulty.
One simply carries out the finite number of specified $t$ tests, each at level
$\alpha$. However, the corresponding experimental error rate is unobtainable in
this case, as it requires the probability integral of the singular multivariate
$t$-distribution (see, for example, Gupta, 1963). An upper bound can be found
using Bonferroni's inequality as in Miller. Let $A_i$ be the event that the
$i$th contrast hypothesis is incorrectly rejected ($i = 1, \ldots, r$); then
$P(A_i) = \alpha$. From Bonferroni's inequality

\[
\gamma = P(\text{at least one } A_i \text{ occurs})
       = P\Bigl(\bigcup_{i=1}^{r} A_i\Bigr)
       \le \sum_{i=1}^{r} P(A_i)
       = r\alpha .
\]

Thus the experimental error rate will not exceed $r\alpha$, or conversely, if
$\gamma$ is given, the decision error rate is at least $\gamma/r$. If $r$ and
$\alpha$ are both large this bound will not be of much use, but if $r$ is not
more than 5 and $\alpha$ not more than 0.01, it can be better than the Scheffé
procedure.

Suppose, in the two examples given above, we wish to test just the three
comparisons of the means, and no other contrasts. For $\gamma = 0.135$
(corresponding to $\alpha = 0.05$ on all contrasts), Bonferroni's inequality
gives $\alpha \ge 0.045$, i.e. the three comparisons would have to be tested
using $\alpha = 0.045$ to ensure that $\gamma \le 0.135$. But $\alpha = 0.05$
on all contrasts gives $\gamma = 0.135$, so the Bonferroni bound is of no use.
On the other hand, if $\gamma = 0.05$, so that $\alpha = 0.0164$ on all
contrasts, the inequality gives $\alpha \ge 0.0167$, a very slight improvement.
The critical value for the three comparisons would then be 2.780 instead of
2.789 for all contrasts.
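
The arithmetic behind the two Bonferroni figures can be reproduced in a few lines. The function names below are introduced for illustration and are not part of any procedure described in the paper.

```python
def bonferroni_alpha(gamma, r):
    """Per-contrast level that keeps the experimental error rate at or below gamma."""
    return gamma / r

def bonferroni_gamma_bound(alpha, r):
    """Upper bound r * alpha on the experimental error rate (capped at 1)."""
    return min(1.0, r * alpha)

# The three pairwise comparisons of three group means: r = 3.
print(bonferroni_alpha(0.135, 3))            # compare with 0.05 on all contrasts
print(round(bonferroni_alpha(0.05, 3), 4))   # compare with 0.0164 on all contrasts
```

In the first case the Bonferroni level is below the level already available for all contrasts, so nothing is gained; in the second it is slightly above it.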


4. CHOICE OF ERROR RATE


The examples given previously have used either $\alpha = 0.05$ or
$\gamma = 0.05$. Miller argues that simultaneous procedures are appropriate
when a high level of protection is required for the overall null hypothesis,
implying that $\gamma$ should be taken as one of the 'conventional' levels.
Petrinovich & Hardyck (1969) state that the 'use of ... pooled-variance
t-tests ... produces Type I error rates experimentwise far in excess of
currently acceptable standards'. It has already been noted that Rodger rejects
the use of $\gamma = 0.05$ as irrelevant in many experimental situations, as
the null hypothesis is already known to be false, the groups being combined in
the one experiment only to provide a more reliable estimate of $\sigma^2$.
Gabriel (1964) takes a similar view: 'Experimenters are often far more
interested in pairwise comparisons than in overall tests, and have only
reluctantly been persuaded by statisticians to use F-tests instead of an array
of pairwise t-tests. But if the experimenter's interest is in the subsets, the
F-test should be regarded merely as a preliminary to see whether any (contrast)
might be significant ... Thus ... the experimenter may wish to fix
[$\alpha$] = 0.01 for pairwise comparisons and accordingly … [$\gamma$] … as
the level for the overall …' (referring to an example with $k = 8$,
$\nu = 40$).

It seems that the choice of $\gamma$ should depend on the experimental
conditions. If it is already known that all groups do not have the same mean,
a Type I experimental error rate of $\gamma = 0.5$ might be viewed with
equanimity. Conversely, if for such experiments it is desired to set
$\alpha = 0.05$, then $\gamma$ … calculated explicitly. A large value of
$\gamma$ might also be appropriate in the early stages of some exploratory
investigations with pilot samples. On the other hand, in confirmatory studies
where the consequences of Type I errors might be serious, $\gamma$ might be
taken smaller than 0.01.


EXPERIMENTATION BY COMPUTER: SOME PRELIMINARY STEPS


By C. WOLFENDALE


University of Nottingham


A computer system is described which carries out on-line experiments
involving m-independent variables and one dependent variable. Using a search
procedure that applies hillclimbing techniques, the system selectively places
points in regions that demand them for reliable description. Some results of its
performance are given in Section 3. The experimentation system leaves
information in store and this is used by a model-assessment procedure in order to
achieve the optimum predictive efficiency for each of the functional models under
consideration.

This study is part of a research project aimed at producing computer-
controlled experimentation which will not only assess functional models, but also
generate them, assess and generate discrete or 'structural' models, and design
and carry out experiments whose results will be essential for discrimination
between models. The work is also intended as a contribution to the problem
of computer program semantics.


1. INTRODUCTION 
1.1. This paper summarizes the first stages of an attempt to develop the
process of computer-controlled experimentation from one of simple executive
control to one in which quite sophisticated decisions are made, involving what
may be described as 'creative' operations. In essence the total scheme is
conceived as a recursive one in that, when complete, it will contain the seeds
which are necessary and sufficient for its own development, so that after
initialization it will progress by a process of self-application rather than one
in which operations are dependent upon human control. This final goal can only
be achieved by a gradual development, starting from the simplest experimental
situations and relevant computer sub-procedures and progressing by very small
steps. This is because the present state of computer science is such that many
of the seemingly elementary problems concerning strategies of … control,
'model' assessment, 'model' generation and so on (all of which are essential
for the achievement of the above goal) are either unsolved or …
Basically, the system will be a realization of Popper's (1959) … scientific
process except that his ideas are necessarily made more precise by the need to
translate them into a computer system.

First, it is necessary to consider … concerning experimentation that
involves … independent and … dependent variables which are either absolutely
… or pseudo-discrete (i.e. continuous but coded into a discrete form before
arrival at the computer).

Experimental data can usually only be described in terms of 'deterministic'
or 'actual' components plus perturbations due to uncontrolled, unknown or
uncontrollable variables. Different combinations of variable-types,
'deterministic' components and perturbations will demand different experimental
techniques so that the system to be implemented must be responsive to these
features to enable selection of the appropriate design.

The simplest case, which happens also to be a common one, is one in which
there are m-independent variables and only one dependent variable, all of which
are continuous.

For more complex experimental designs such a description will be inadequate
as it stands, but when incorporated into a more adequate description it
will often be found that it remains unchanged, except that decision procedures
for selection of the necessary parameters for its implementation must also be
included in the total description.

For example, in a study of the effects of signal intensity, pay-off matrix
values, and loss function for time to detection, upon decision times and the
probability of detection, the situation can be considered as interdependent
sub-studies in which: (a) the dependent variable is continuous (decision times)
and there are three continuous independent variables (intensity, pay-off, time
cost); (b) there is a discrete selection amongst experiments of type (a) with
the variables (parameters) being (i) the number of signal intensities, (ii) the
number of response categories, etc.; and (c) there is an interaction between a
discrete variate (probability of detection) and functional relations which
are found in (a) and (b) above. Such a complex experiment will be referred to
as a confounded situation.

In order to realize such a study one must first choose the values of sub-study
(b), and then carry out a series of experiments of the kind described in (a). A
further choice is made in sub-study (b) and a new series of experiments is defined
in (a) and so on. Sub-study (c) becomes relevant only after (a) and (b) have
been in operation.

Since many experimental situations can be thus analysed, even though they
are confounded, it seemed reasonable to consider experiments of the form
described in (a) as a first stage in the computer implementation.

… intervals (basic factorial design). … procedure has many shortcomings,
the most serious of which is the ad hoc selection of values for the
independent variables. These points … function under study and therefore …
whose accuracy is a matter of luck … The actual functional relationship is
shown along with the description that would have been obtained by joining the
found points by straight line segments. As can be seen, much of the data is
redundant and misses the area that is most important for an accurate
description of the actual function.

Fig. 1b shows how those eight points (two of them define the range) could
have been distributed so as to lead to a much more precise description of the
actual function. This would have required a function-dependent procedure,
however, which would involve a sequential assessment of properties of the
function under study in order to determine the placement of the next point.

FIGURE 1. (a) Function description for equal interval design; (b) description of same
function for function-dependent design. ○, points selected by the respective
designs; the two curves are the actual but unknown function and the straight-line
empirical description of functions.
In order that the computer system would be as independent of operator
control as possible and also function-dependent, it was decided to introduce into
… techniques. These techniques search for maxima (minima) of functions, …
at first does not seem to be the solution to the problem. However, as will be
shown later, some of these techniques behave in the way required as a
by-product of their search.

In general, the hillclimbing process can be described as follows. Suppose
that we are given a black box with inputs $X = (x_1, \ldots, x_m)$ and an
output $O(X)$. We wish to find the maximum (minimum) … the true peak (or
valley) is surrounded by a relatively large region of little functional
relationship we find that the hillclimbing … negligibly from most points, but
if the point happens to … 'leap' out of the region …, or more or less no
change at all. … the 'mesa' phenomenon (Minsky, 1961). The more the functions
… techniques. However, I consider that these techniques will form an essential
… the overall scheme proposed in Section 1.1. As Minsky (1961) says: 'I doubt
that in any one simple mechanism, e.g. hillclimbing, will we find the means to
build an efficient and general problem-solving machine. Probably an intelligent
machine will require a variety of different mechanisms. These will be arranged
in hierarchies, and in even more complex, perhaps recursive structures. And
perhaps what amounts to straightforward hillclimbing on one level may sometimes
appear (on a lower level) as the sudden jumps of "insight".'

The techniques of hillclimbing have an extensive range, and therefore, in
order to reach a decision as to the exact nature of the implemented process, the
following points had to be taken into consideration:

(a) The data that would be obtained from experiments would, in general, 
have a deterministic component and a probabilistic or ‘noisy’ component. 
Hence the field from which selection could be made was that of stochastic 
hillclimbing. 

(b) The process was not being used merely to find maxima or minima but 
was being used to describe an empirical function. 

(c) The system had to be one which was independent of the stage that it
had reached in search, as defined by the number of readings, but was dependent
on the number of times that the estimated gradient had changed sign during
search. If the number of changes in sign reduced step-size and local exploration
regions, then the hillclimbing process would be such that it would place few
points in regions where the function is changing its gradient minimally, and
place relatively more points in regions of high changes of gradient, especially
where gradient is changing in sign.

(d) Since the process was to be used on-line, it had to involve the minimum
of calculation. This is especially critical as the number of independent
variables increases. A process was required that involved only a linear
increase in complexity as the number of independent variables increased.

(e) All hillclimbing processes progress sequentially over their searched area.
This means that if such a process were to be used on-line, the subject would
pick up and respond to the obvious sequential effects. In order to counteract
this it was necessary to invent strategies of randomization that would not disrupt
the hillclimbing process but which would present a sequence to the subject that
was in effect random.


1.3. The Formal Bases of Hillclimbing Techniques and the Rationale for the 
Process Adopted 


Dvoretzky (1956) formulated a set of weak conditions that a stochastic
hillclimbing process must satisfy in order to ensure that it eventually reaches
within $\epsilon$ of the true maximum or minimum in finite time, with
probability one. Consider one hillclimbing process (Kiefer & Wolfowitz, 1952),
whose movement is defined recursively by the equation:

\[
x_{n+1} = x_n + \frac{a_n}{c_n}\,\bigl[f(x_n + c_n) - f(x_n - c_n)\bigr],
\]

where $x_{n+1}$ is the newly selected point and $f$ is the function that is
being searched. Then in order to average out the noise in the function under
study, we must have

\[
\sum_{n=1}^{\infty} \left(\frac{a_n}{c_n}\right)^{2} < \infty .
\]

In order that the process does not converge before reaching the maximum or
minimum we require that

\[
\sum_{n=1}^{\infty} a_n = \infty .
\]

But to prevent oscillation and to enable a 'settling down' we also demand

\[
\lim_{n\to\infty} a_n = 0, \qquad \lim_{n\to\infty} c_n = 0 .
\]

The above are called 'Dvoretzky conditions'. The simplest formula for $a_n$ is
given by the harmonic series, $a_n = 1/n$, since

\[
\lim_{n\to\infty} 1/n = 0 \qquad\text{and}\qquad \sum_{n=1}^{\infty} 1/n = \infty .
\]

In order to satisfy the condition

\[
\sum_{n=1}^{\infty} \left(\frac{a_n}{c_n}\right)^{2} < \infty
\]

we can take $c_n = (a_n)^{1/4}$, so that

\[
\sum_{n=1}^{\infty} \left(\frac{a_n}{c_n}\right)^{2}
  = \sum_{n=1}^{\infty} n^{-3/2} < \infty
\]

as required. Hence the Kiefer-Wolfowitz process, with these choices of $a_n$
and $c_n$, satisfies the Dvoretzky conditions, is easy to calculate
(requirement (d)), and seeks out maxima and minima. However, it is only
appropriate for … and it decelerates as a function of $n$, the … of changes in
sign of gradient. For m-independent variables the process is

\[
X_{n+1} = X_n + \frac{a_n}{c_n}\,\Bigl[\bigl(f(X_n + c_n^{(1)}) - f(X_n - c_n^{(1)})\bigr),
  \ldots, \bigl(f(X_n + c_n^{(m)}) - f(X_n - c_n^{(m)})\bigr)\Bigr],
\]

where

\[
X_n = (x_1, \ldots, x_m)_n = (x_{1n}, \ldots, x_{mn})
\]

and

\[
X_n \pm c_n^{(i)} = (x_{1n}, x_{2n}, \ldots, x_{in} \pm c_n, \ldots, x_{mn}).
\]


In order to make the system independent of stage but dependent upon change
in gradient-sign, we need merely to replace $n$ by $N$, the number of changes of
sign of gradient during search. This was proposed by Kesten (1958) for the
Kiefer-Wolfowitz process. Hence, finally, the process can be described by
the equation:

\[
X_{n+1} = X_n + \frac{a_N}{c_N}\,\Bigl[\bigl(f(X_n + c_N^{(1)}) - f(X_n - c_N^{(1)})\bigr),
  \ldots, \bigl(f(X_n + c_N^{(m)}) - f(X_n - c_N^{(m)})\bigr)\Bigr],
\]

where $N$ is the number of changes in sign of gradient during a search of
length $n$. When $n = 1$, $N$ is given the value $N = 1$.

Kesten shows that the so-called accelerated process converges with probability
one when $c_N$ is a constant. From the implementation point of view,
however, adopting such a restriction on $c_N$ would have produced a system that
relied too heavily on the operator's choice of the constant, whereas if
$c_N = (a_N)^{1/4}$ as before, the values of $c_N$ are generated by the system
itself. Since the implemented scheme terminates search when step-size has been
reduced to below a certain value (which defines the precision of function
description), and is therefore not an exhaustive search, the Kesten restriction
does not apply.
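
As an illustration of the accelerated process for a single independent variable, the following sketch applies the recursion with $a_N = 1/N$ and $c_N = (a_N)^{1/4}$ and stops when the step-size falls below a threshold. It is a simplified reading of the scheme and not the implemented system: the function name, the parameters `min_step` and `max_steps`, and the noise-free test function are assumptions made here, and the randomization and boundary handling are omitted.

```python
def kw_kesten_climb(f, x0, min_step=1e-3, max_steps=500):
    """One-dimensional Kiefer-Wolfowitz ascent with Kesten acceleration.

    N counts the changes in sign of the estimated gradient; the gains
    a_N = 1/N and c_N = a_N ** 0.25 shrink only when the sign changes.
    """
    x = x0
    n_changes = 1            # N is given the value 1 at the start
    last_sign = 0
    path = [x]
    for _ in range(max_steps):
        a = 1.0 / n_changes
        c = a ** 0.25
        diff = f(x + c) - f(x - c)          # two readings around x
        step = (a / c) * diff
        sign = (diff > 0) - (diff < 0)
        if sign != 0:
            if last_sign != 0 and sign != last_sign:
                n_changes += 1              # gradient changed sign
            last_sign = sign
        x += step
        path.append(x)
        if abs(step) < min_step:            # precision of description reached
            break
    return x, n_changes, path

# Noise-free example with a single peak at x = 2 (hypothetical test function).
peak, changes, visited = kw_kesten_climb(lambda x: -(x - 2.0) ** 2, 0.0)
print(peak, changes, len(visited))
```

Because the gains depend on the count of sign changes rather than on the number of readings, the step-size stays large while the climb moves in one direction and shrinks only as the process straddles a peak.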

The above system satisfies all the requirements listed in Section 1.2 except
that of randomization, which will be dealt with below.

One of the faults of hillclimbing processes in general mentioned in Section
1.2 was that they tended to seek out only local maxima and minima. Since the
aim of this work is to produce a system that describes a function adequately, this
particular fault is extremely useful, in that no matter where we place the initial
point of search we can be sure that the system will home in on its local maximum
or minimum (if they are within its discriminative power), so that at the end of
search we can be sure that there are no discriminable maxima or minima within
the searched region apart from the ones that have been detected. If this region
is now closed from further search, and another point is taken in the region that
is still to be explored (within the initial range of values for the independent
variables), then if another hillclimb is effected, it will detect further discriminable
maxima or minima or complete the total range of study if there are no others.
With certain protective restrictions on the process it is therefore possible for it
to be used to pick out all discriminable regions of change in gradient sign
(maxima and minima) within a total range of experimental study, and place a
higher density of points within these most important regions (a product of
Kesten acceleration).

As for randomization, I have introduced it in the process at two main
points:

(a) …

(b) Within the hillclimb there is randomization as to whether the next trial
… search for a minimum. The climb is only completed sequentially when one or
the other criteria have been reached. This is made more explicit in the next …

… process stops, with information available for a higher-… to use as
required. This means that …

The main procedures are SEARCH and HILLCLIMB, which use the following
service procedures:

(a) RANDOM. A machine code body implementation of the algorithm given
by Lehmer (1951) for generating random bit patterns in a computer word. This
is used to generate a random digit $i$ ($0 \le i \le 9$), and leave it in store
to be used by later procedures.

(b) RANDV. This uses procedure RANDOM but generates real values
(positive or negative) up to as many significant decimal places as are required.
The placement of the decimal point and the number of significant places are
specified in the parameter vector of the procedure. The variate so generated
is rectangularly distributed in the range

$-999\ldots999 \times 10^{k}$ to $+999\ldots999 \times 10^{k}$,

where the number of 9's and the values of $k$ are …

(c) RANK. This procedure takes items in an array and, using a bubble-sort
for speed, places them in increasing order without disrupting
the original array.
(d) FUNC. This is used to simulate the … and coefficients specified by the
parameter vector.

FIGURE 2. Flow-chart of procedure SEARCH, simplified to one independent variable for
clarity of representation. (Boxes: call RANK for high and low bounds; order the
available search areas; terminate if the available area is below MINSUM; find a
random point with STANDR; effect a HILLCLIMB between a low bound and a high bound
taken from RANK; add the new low and high bounds; terminate with relevant
messages and data.)

(e) STANDR. Uses RANDOM and RANDV to convert the output of RANDV into
a rectangular distribution in the range between 0 and 1.

(f) NORMAL. Transforms the output from RANDV into a normal variate
array NORM, and calculates the mean, standard deviation and sums of squares
of the generated values. The output from this routine has been tested
extensively and its normality is excellent.
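
To make the roles of RANDOM and RANK concrete, here is a small sketch of a Lehmer-type multiplicative congruential generator that yields a random digit, together with a bubble-sort that leaves its input untouched. The multiplier and modulus are assumptions chosen for illustration (a commonly used pair), not the constants of the machine-code procedure, and the class and function names are hypothetical.

```python
class LehmerDigits:
    """Multiplicative congruential generator: x <- (multiplier * x) mod modulus."""

    def __init__(self, seed=1, multiplier=48271, modulus=2**31 - 1):
        if not 0 < seed < modulus:
            raise ValueError("seed must lie strictly between 0 and the modulus")
        self.state = seed
        self.multiplier = multiplier
        self.modulus = modulus

    def next_word(self):
        """Advance the generator and return the new state."""
        self.state = (self.multiplier * self.state) % self.modulus
        return self.state

    def digit(self):
        """Random digit i with 0 <= i <= 9, scaled from the next state."""
        return self.next_word() * 10 // self.modulus


def rank(items):
    """Bubble-sort a copy of items into increasing order; the input is not altered."""
    ordered = list(items)
    for end in range(len(ordered) - 1, 0, -1):
        for j in range(end):
            if ordered[j] > ordered[j + 1]:
                ordered[j], ordered[j + 1] = ordered[j + 1], ordered[j]
    return ordered


generator = LehmerDigits(seed=1)
digits = [generator.digit() for _ in range(20)]
bounds = [4.2, -1.0, 7.5, 0.3]
print(digits, rank(bounds), bounds)
```

Sorting a copy is one simple way to rank the bounds "without disrupting the original array"; the original procedure may well have used an index array instead.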

We now come to the main procedures.

SEARCH. The range(s) of the independent variable(s) under study is (are)
specified in the parameter vector. A random point is chosen within the range
and HILLCLIMB is called on. When HILLCLIMB has finished, a lower bound and an
upper bound for the climb are generated for all independent variables, thus
defining a closed region. The SEARCH procedure now ranks all the lower and
upper bounds in increasing order, and then a random point in the space still
to be explored is selected, with a probability weighted according to the size of
the space involved.

The search is limited by the specification in its parameter vector of the
…; (b) a maximum number of … If there is zero (or very little)
functional relationship over the total range of study, …

A flow-diagram of SEARCH is shown in Fig. 2 for one independent variable.
The case of m-independent variables … principles described in this diagram,
but to include this would have produced a more confusing flow-chart. SEARCH is
in fact …

HILLCLIMB. When SEARCH has found a new … in order to minimize a subject's
response to sequential effects. HILLCLIMB … 'climb'. These are initialized at
+1. When the following events occur, the relevant markers are set to −1:
(a) step-size is reduced to below a certain value before reaching an existing
boundary. Because of Kesten acceleration, this indicates either a peak or a
valley. (b) The next reading is outside existing boundaries. When both markers
are equal to −1 … HILLCLIMB will also … exceeded a specified value. HILLCLIMB
is shown in flow-diagrammatic form in Figs. 3a, b.

FIGURE 3a. Procedure HILLCLIMB. (Flow-chart: initialize parameters on entry from
SEARCH; choose between the peak search PSCH and the valley search VSCH using
STANDR; test whether XP1 lies outside HILIMIT or LOLIMIT; compute XP2 from XP1 and
the readings at XP1 + CP and XP1 − CP for all dimensions; test whether the sign of
the gradient is unchanged and, if not, set PN := PN + 1 and APN := E/PN; readjust
the high and low boundaries of the searched region.)

FIGURE 3b. Procedure HILLCLIMB, continued. (Flow-chart: test whether limits are
exceeded by XP2; test whether step-size has reduced below ETA; output the current
result; VSCH is the same as PSCH except that V replaces P in identifiers; output
the necessary information and the new closed regions defined by bounds;
terminate.)

3. SOME TESTS OF THE SYSTEM

In order to test the system, procedures FUNC and NORMAL were used to
simulate a subject's behaviour. Clearly it is essential to study the behaviour
of the system in known functional situations and in known noise conditions.
First of all it was studied in conditions of little 'noise' in order to see how
it performed for a variety of functions. Its performance was satisfactory in
that it placed experimental readings in regions that needed them for an accurate
description, with a greater density than for regions that needed fewer. The
next test was to assess its stability as noise increased. The function selected was
$f(x) = 8\sin(x)$ under two conditions of noise; one with a standard deviation
of unity and the other with a standard deviation of 9. An equal number of
readings per explored point was selected for both noise values so that the
standard errors also had the ratio 9/1. Fig. 4 shows the results for the noise
with a standard deviation of 1. As can be seen, the maxima and minima have
been pin-pointed accurately, with a higher density of readings about them.
The term 'stage' refers to the selection of a new region by SEARCH in order to
effect a HILLCLIMB.


TABLE 1. PERFORMANCE OF THE SYSTEM FOR FUNCTION 8 SIN(x) UNDER TWO
CONDITIONS OF NORMAL NOISE (S.D. = 1 AND S.D. = 9)

S.D.    No. of stages    No. of trials
1       7                35
9       10               41

FIGURE 4. The results of exploration by the system of f(x) = 8 sin(x) which is
perturbed by normal noise of zero mean and standard deviation = 1. (Axes: x from
−10 to +10, f(x) from −10 to +10; stage numbers are marked above the plot.)

Fig. 5 shows the same function perturbed by noise of standard deviation
equal to 9. Although there is a ninefold increase in noise there is only a slight
change in the number of stages and the total number of readings. Table 1
summarizes the results for the two cases. Thus the system gives an impression
of being relatively stable under noisy conditions. Tests with other functions
(polynomials of differing degrees and higher order Fourier series) and other noise
values have been made, and the system seems very stable.

FIGURE 5. The results of exploration by the system of f(x) = 8 sin(x) which is
perturbed by normal noise of zero mean and standard deviation = 9. (Axes: x from
−10 to +10, f(x) from −10 to +10; stage numbers are marked above the plot.)

The system avoids the problem of zero movement in areas of small functional
relationship by assessing at a given stage that very little progress is going
to be made in the region under study. The stage terminates and SEARCH selects
another unsearched region. When a study was made of $f(x) = 0$ with noise of
unit standard deviation, it was found that the system took … points at random
in the total range of study and then stopped.

Tests to date indicate that it may be a viable and robust system for on-line
experimentation.


4. THE ASSESSMENT OF MATHEMATICAL MODELS AFTER OR DURING ON-LINE
EXPERIMENTATION

In an experimental situation we are commonly interested in (a) efficient and
accurate description of the empirical functions under study, and (b) the assessment
of a set of models which are predictive in the experimental situation under
consideration. Of course these are not independent, since accuracy of model
prediction will be confounded with accuracy of functional description. The
problem in (a) has so far been considered to some extent in this paper, and now
(b) will be discussed.

We usually assess models by how well they do in predicting the data that
are found in an experiment (although this is not always the best criterion).
Hence a system has been developed which does just this for mathematical models.
For each model there will be a set of unknown parameters (m, say) and the
system finds that set of parameter values which optimizes the predictive efficiency
of that model. At the end of this process, therefore, it is possible to assess the
models on the basis of the best that they can do in predicting the observed data.

It is fortunate that the process of finding the optimum set of parameter
values for a model can be described as a hillclimbing process which seeks out
the overall minimum value of a criterion function which is sensitive to the
difference between the predicted function and the observed one. Since SEARCH
and HILLCLIMB processes leave all explored values of the independent and
dependent variables in an array in store, it is a simple matter to trigger off the
MODEL ASSESSMENT procedure at the end of experimentation or even during it.
Since MODEL ASSESSMENT is basically the same as SEARCH + HILLCLIMB except that
it operates on a data array, has a set of options as to the criterion function that
has to be minimized, and contains (as a sub-procedure) the set of models to be
assessed, it is not necessary to describe it in any further detail, except to indicate
the types of criteria that are used. These are

(a) $\sum (X - Y)^2$ (least squares),

(b) $\sum \{(X - Y)/Y\}^2$ (fractional least squares),

(c) $\sum \{\log(Y + \epsilon) - \log(X + \epsilon)\}^2$ (log least squares).

Here $X$ represents empirical data and $Y$ represents the corresponding predicted
value from a model. $\epsilon$ is a useful attenuator which allows one to alter the
sensitivity of criterion (c) to small values of $X$ and $Y$. This has been especially
useful in the assessment of models for distribution functions whose values only
range between 0 and 1.
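
The three criteria translate directly into code. The sketch below is illustrative: the function names and the sample data are assumptions, with `observed` playing the role of $X$, `predicted` that of $Y$, and `eps` that of the attenuator.

```python
import math

def least_squares(observed, predicted):
    """Criterion (a): sum of squared differences."""
    return sum((x - y) ** 2 for x, y in zip(observed, predicted))

def fractional_least_squares(observed, predicted):
    """Criterion (b): squared differences relative to the predicted value."""
    return sum(((x - y) / y) ** 2 for x, y in zip(observed, predicted))

def log_least_squares(observed, predicted, eps):
    """Criterion (c): squared differences of logarithms, attenuated by eps."""
    return sum((math.log(y + eps) - math.log(x + eps)) ** 2
               for x, y in zip(observed, predicted))

observed = [0.10, 0.40, 0.90]       # hypothetical empirical values X
predicted = [0.12, 0.35, 0.95]      # hypothetical model predictions Y
print(least_squares(observed, predicted),
      fractional_least_squares(observed, predicted),
      log_least_squares(observed, predicted, eps=0.01))
```

A larger `eps` damps the contribution of values near zero to criterion (c), which is the attenuating effect described above.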

When the model-assessment procedure was applied to the results of the
experiment in Section 3 (see Figs. 4-5) it was found that the optimum fit for a
general Fourier model was given by:

(a) $f(x) = 8\sin(x)$ for the case where the noise had unit standard deviation.
This model accounted for 99.28% of the variability in the data, and is the correct
function.

(b) $f(x) = 7.92\sin(0.98x) - 0.02$ for the case where the noise had a standard
deviation equal to 9. This model accounted for 74.62% of the variability in the
data. Surprisingly, even with such an increase in noise the optimum model fit
is so close to the true function.

Other models (exponential functions, polynomials) produced fits which were
inferior to the ones given above.

5. DISCUSSION AND SURVEY

In the previous sections I have described two main systems:

(a) An on-line experimentation system, which explores m-independent
variables in such a way as to optimally describe the dependent variable function
of them. The number and placement of readings depends upon the deterministic
function involved and the noise level. In general, however, it places its
experimental points with a greater density in regions where gradient 'sign' is
changing and with lower density where gradient is relatively constant. This
is essential for reliable description.

(b) A model-assessment system, which uses the information yielded by the
above system, in order to assess the predictive ability of a set of functional
models. For each model it homes in on the optimum set of parameter values
which minimizes the difference between predictions from it and the actual data.
In this way the models can be studied one by one, producing the best performance
for each. This leaves information for a higher decision system which will reject
or accept models on the basis of their performance.

These systems are part of a total system that is being developed and which is
described briefly in Section 1. They are being studied more deeply at present,
specific attention being paid to:

(a) The stability of the systems under noisy conditions especially from the
point of view of determining the optimum model fits.

(b) The development of a comparison between the implemented on-line
system and the ad hoc factorial design. What is being done is a collection of
studies for a given function and for given noise, in which the number of factorial-
design readings is found which gives the same accuracy of description as the
on-line system. This is really a research project in its own right.

(c) The development of more operator-free model-assessment systems
which involve heuristics for selection of models and for decisions as to the
prognoses of climbs during the fitting of models. Interestingly, these will
demand the implementation of hillclimbing systems which are self-applied in
that they will carry out hillclimbs on their own parameters.

Finally, it is to be noted that the subject's performance in the situations
studied has been in fact simulated by the combination of a function generation
procedure and a normal noise procedure. The system has therefore been
attempting to assess models of the performance of a combination of its own
routines.

In the field of computer science much interest has been paid to the problem
of program semantics (Cooper, 1967, 1968; McCarthy & Painter, …; and
others). In essence the above scheme is a pragmatic system for the assessment
of models of the way a given procedure (program or piece of program) actually
performs. This means that the work described will not only be … experimentation
on human or living subjects in general, but will also be a contribution to the
problem of program semantics.